Pub Date: 2023-04-01
DOI: 10.1016/j.gpb.2022.08.004
Zhihang Chen, Ziwei Luo, Di Zhang, Huiqin Li, Xuefei Liu, Kaiyu Zhu, Hongwan Zhang, Zongping Wang, Penghui Zhou, Jian Ren, An Zhao, Zhixiang Zuo
Immunotherapy is a promising cancer treatment method; however, only a few patients benefit from it. The development of new immunotherapy strategies and effective biomarkers of response and resistance is urgently needed. Recently, high-throughput bulk and single-cell gene expression profiling technologies have generated valuable resources. However, these resources are not well organized and systematic analysis is difficult. Here, we present TIGER, a tumor immunotherapy gene expression resource, which contains bulk transcriptome data of 1508 tumor samples with clinical immunotherapy outcomes and 11,057 tumor/normal samples without clinical immunotherapy outcomes, as well as single-cell transcriptome data of 2,116,945 immune cells from 655 samples. TIGER provides many useful modules for analyzing collected and user-provided data. Using the resource in TIGER, we identified a tumor-enriched subset of CD4+ T cells. Patients with melanoma with a higher signature score of this subset have a significantly better response and survival under immunotherapy. We believe that TIGER will be helpful in understanding anti-tumor immunity mechanisms and discovering effective biomarkers. TIGER is freely accessible at http://tiger.canceromics.org/.
Title: TIGER: A Web Portal of Tumor Immunotherapy Gene Expression Resource
Journal: Genomics, Proteomics & Bioinformatics, Volume 21, Issue 2, Pages 337-348
Pub Date: 2023-02-01
DOI: 10.1016/j.gpb.2022.05.007
Waleed Iqbal, Wanding Zhou
Dissecting intercellular epigenetic differences is key to understanding tissue heterogeneity. Recent advances in single-cell DNA methylome profiling have presented opportunities to resolve this heterogeneity at the maximum resolution. While these advances enable us to explore frontiers of chromatin biology and better understand cell lineage relationships, they pose new challenges in data processing and interpretation. This review surveys the current state of computational tools developed for single-cell DNA methylome data analysis. We discuss critical components of single-cell DNA methylome data analysis, including data preprocessing, quality control, imputation, dimensionality reduction, cell clustering, supervised cell annotation, cell lineage reconstruction, gene activity scoring, and integration with transcriptome data. We also highlight unique aspects of single-cell DNA methylome data analysis and discuss how techniques common to other single-cell omics data analyses can be adapted to analyze DNA methylomes. Finally, we discuss existing challenges and opportunities for future development.
Title: Computational Methods for Single-cell DNA Methylome Analysis
Journal: Genomics, Proteomics & Bioinformatics, Volume 21, Issue 1, Pages 48-66
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10372927/pdf/
Pub Date: 2023-02-01
DOI: 10.1016/j.gpb.2022.02.006
Chao Li, Wen Chu, Rafaqat Ali Gill, Shifei Sang, Yuqin Shi, Xuezhi Hu, Yuting Yang, Qamar U. Zaman, Baohong Zhang
The past decade has witnessed a rapid evolution in identifying more versatile clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein (Cas) nucleases and their functional variants, as well as in developing precise CRISPR/Cas-derived genome editors. The programmable and robust features of the genome editors provide an effective RNA-guided platform for fundamental life science research and subsequent applications in diverse scenarios, including biomedical innovation and targeted crop improvement. One of the most essential principles is to guide alterations in genomic sequences or genes in the intended manner without undesired off-target impacts, which strongly depends on the efficiency and specificity of single guide RNA (sgRNA)-directed recognition of targeted DNA sequences. Recent advances in empirical scoring algorithms and machine learning models have facilitated sgRNA design and off-target prediction. In this review, we first briefly introduce the different features of CRISPR/Cas tools that should be taken into consideration to achieve specific purposes. Secondly, we focus on the computer-assisted tools and resources that are widely used in designing sgRNAs and analyzing CRISPR/Cas-induced on- and off-target mutations. Thirdly, we provide insights into the limitations of available computational tools that would help researchers of this field for further optimization. Lastly, we suggest a simple but effective workflow for choosing and applying web-based resources and tools for CRISPR/Cas genome editing.
Title: Computational Tools and Resources for CRISPR/Cas Genome Editing
Journal: Genomics, Proteomics & Bioinformatics, Volume 21, Issue 1, Pages 108-126
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10372911/pdf/
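The sgRNA design step discussed in the CRISPR/Cas review above can be illustrated with a minimal sketch. It assumes the widely used SpCas9 convention of a 20-nt protospacer followed by an NGG PAM, and it scans only the strand that is given. The function names are hypothetical and are not taken from any tool covered by the review; real design tools also score efficiency and off-target risk, which this sketch does not attempt.

```python
import re


def find_spcas9_targets(sequence, spacer_len=20):
    """Return (start, protospacer, pam) tuples for NGG PAM sites on the given strand."""
    seq = sequence.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


def gc_fraction(spacer):
    """Fraction of G and C bases, a simple feature often inspected during design."""
    return (spacer.count("G") + spacer.count("C")) / len(spacer)


# Toy sequence made up for illustration: a 20-nt spacer, a TGG PAM, one extra base.
toy_sequence = "ACGTACGTACGTACGTACGT" + "TGG" + "A"
toy_hits = find_spcas9_targets(toy_sequence)
```

A complete scan would also search the reverse complement, since Cas9 can target either strand.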
Pub Date: 2023-02-01
DOI: 10.1016/j.gpb.2022.12.005
Gang Chen, Salma Mostafa, Zhaogeng Lu, Ran Du, Jiawen Cui, Yun Wang, Qinggang Liao, Jinkai Lu, Xinyu Mao, Bang Chang, Quan Gan, Li Wang, Zhichao Jia, Xiulian Yang, Yingfang Zhu, Jianbin Yan, Biao Jin
Jasminum sambac (jasmine flower), a world-renowned plant appreciated for its exceptional flower fragrance, is of cultural and economic importance. However, the genetic basis of its fragrance is largely unknown. Here, we present the first de novo genome assembly of J. sambac with 550.12 Mb (scaffold N50 = 40.10 Mb) assembled into 13 pseudochromosomes. Terpene synthase (TPS) genes associated with flower fragrance are considerably amplified in the form of gene clusters through tandem duplications in the genome. Gene clusters within the salicylic acid/benzoic acid/theobromine (SABATH) and benzylalcohol O-acetyltransferase/anthocyanin O-hydroxycinnamoyltransferases/anthranilate N-hydroxycinnamoyl/benzoyltransferase/deacetylvindoline 4-O-acetyltransferase (BAHD) superfamilies were identified to be related to the biosynthesis of phenylpropanoid/benzenoid compounds. Several key genes involved in jasmonate biosynthesis were duplicated, causing an increase in copy numbers. In addition, multi-omics analyses identified various aromatic compounds and many genes involved in fragrance biosynthesis pathways. Furthermore, the roles of JsTPS3 in β-ocimene biosynthesis, as well as JsAOC1 and JsAOS in jasmonic acid biosynthesis, were functionally validated. The genome assembled in this study for J. sambac offers a basic genetic resource for studying floral scent and jasmonate biosynthesis, and provides a foundation for functional genomic research and variety improvements in Jasminum.
Title: The Jasmine (Jasminum sambac) Genome Provides Insight into the Biosynthesis of Flower Fragrances and Jasmonates
Journal: Genomics, Proteomics & Bioinformatics, Volume 21, Issue 1, Pages 127-149
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10372924/pdf/
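The jasmine genome abstract above reports a scaffold N50 of 40.10 Mb. As a reminder of what that statistic means, here is a minimal sketch of the standard N50 definition: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly size. The scaffold lengths below are made up for illustration and are not from the J. sambac assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running total to half the assembly is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical scaffold lengths in Mb (illustrative only).
toy_scaffolds = [50, 40, 30, 20, 10]
toy_n50 = n50(toy_scaffolds)
```

For the toy lengths, the two longest scaffolds (50 + 40 = 90) already cover at least half of the 150 Mb total, so the N50 is 40.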
Pub Date: 2023-02-01
DOI: 10.1016/j.gpb.2022.10.001
Shuangsang Fang, Bichao Chen, Yong Zhang, Haixi Sun, Longqi Liu, Shiping Liu, Yuxiang Li, Xun Xu
The development of spatial transcriptomics (ST) technologies has transformed genetic research from a single-cell data level to a two-dimensional spatial coordinate system and facilitated the study of the composition and function of various cell subsets in different environments and organs. The large-scale data generated by these ST technologies, which contain spatial gene expression information, have elicited the need for spatially resolved approaches to meet the requirements of computational and biological data interpretation. These requirements include dealing with the explosive growth of data to determine the cell-level and gene-level expression, correcting the inner batch effect and loss of expression to improve the data quality, conducting efficient interpretation and in-depth knowledge mining both at the single-cell and tissue-wide levels, and conducting multi-omics integration analysis to provide an extensible framework toward the in-depth understanding of biological processes. However, algorithms designed specifically for ST technologies to meet these requirements are still in their infancy. Here, we review computational approaches to these problems in light of corresponding issues and challenges, and present forward-looking insights into algorithm development.
Title: Computational Approaches and Challenges in Spatial Transcriptomics
Journal: Genomics, Proteomics & Bioinformatics, Volume 21, Issue 1, Pages 24-47
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10372921/pdf/
Pub Date: 2023-02-01
DOI: 10.1016/j.gpb.2022.06.004
Renfei Ma, Shangfu Li, Wenshuo Li, Lantian Yao, Hsien-Da Huang, Tzong-Yi Lee
The purpose of this work is to enhance KinasePhos, a machine learning-based kinase-specific phosphorylation site prediction tool. Experimentally verified kinase-specific phosphorylation data were collected from PhosphoSitePlus, UniProtKB, the GPS 5.0, and Phospho.ELM. In total, 41,421 experimentally verified kinase-specific phosphorylation sites were identified. A total of 1380 unique kinases were identified, including 753 with existing classification information from KinBase and the remaining 627 annotated by building a phylogenetic tree. Based on this kinase classification, a total of 771 predictive models were built at the individual, family, and group levels, using at least 15 experimentally verified substrate sites in positive training datasets. The improved models demonstrated their effectiveness compared with other prediction tools. For example, the prediction of sites phosphorylated by the protein kinase B, casein kinase 2, and protein kinase A families had accuracies of 94.5%, 92.5%, and 90.0%, respectively. The average prediction accuracy for all 771 models was 87.2%. For enhancing interpretability, the SHapley Additive exPlanations (SHAP) method was employed to assess feature importance. The web interface of KinasePhos 3.0 has been redesigned to provide comprehensive annotations of kinase-specific phosphorylation sites on multiple proteins. Additionally, considering the large scale of phosphoproteomic data, a downloadable prediction tool is available at https://awi.cuhk.edu.cn/KinasePhos/download.html or https://github.com/tom-209/KinasePhos-3.0-executable-file.
Title: KinasePhos 3.0: Redesign and Expansion of the Prediction on Kinase-specific Phosphorylation Sites
Journal: Genomics, Proteomics & Bioinformatics, Volume 21, Issue 1, Pages 228-241
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10373160/pdf/
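The KinasePhos 3.0 abstract above states that models were built only where at least 15 experimentally verified substrate sites were available as positive training data. A minimal sketch of that kind of filter follows. The record layout, kinase names, and function name are hypothetical; only the threshold of 15 and the kinase counts in the final check come from the abstract, and this is not the tool's actual code.

```python
from collections import Counter

MIN_SITES = 15  # threshold stated in the abstract


def trainable_kinases(site_records, min_sites=MIN_SITES):
    """Return kinases (sorted) with at least `min_sites` verified substrate sites.

    `site_records` is an iterable of (kinase, site_id) pairs; this layout is assumed.
    """
    counts = Counter(kinase for kinase, _site in site_records)
    return sorted(k for k, n in counts.items() if n >= min_sites)


# Made-up records: KIN_A has exactly 15 sites, KIN_B has 14.
toy_records = [("KIN_A", f"P{i}") for i in range(15)]
toy_records += [("KIN_B", f"Q{i}") for i in range(14)]
toy_trainable = trainable_kinases(toy_records)

# Consistency check on the counts quoted in the abstract:
# 753 kinases classified via KinBase plus 627 annotated by phylogeny give 1380.
kinase_total = 753 + 627
```

With the toy data only KIN_A passes the filter, since KIN_B falls one site short.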
Pub Date: 2023-02-01
DOI: 10.1016/j.gpb.2023.01.005
Jianzhi Zhang
Genomics, an interdisciplinary field of biology on the structure, function, and evolution of genomes, has revolutionized many subdisciplines of life sciences, including my field of evolutionary biology, by supplying huge data, bringing high-throughput technologies, and offering a new approach to biology. In this review, I describe what I have learned from genomics and highlight the fundamental knowledge and mechanistic insights gained. I focus on three broad topics that are central to evolutionary biology and beyond—variation, interaction, and selection—and use primarily my own research and study subjects as examples. In the next decade or two, I expect that the most important contributions of genomics to evolutionary biology will be to provide genome sequences of nearly all known species on Earth, facilitate high-throughput phenotyping of natural variants and systematically constructed mutants for mapping genotype–phenotype–fitness landscapes, and assist the determination of causality in evolutionary processes using experimental evolution.
Title: What Has Genomics Taught An Evolutionary Biologist?
Journal: Genomics, Proteomics & Bioinformatics, Volume 21, Issue 1, Pages 1-12
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10373158/pdf/
Pub Date: 2023-02-01
DOI: 10.1016/j.gpb.2022.02.004
Oh Kwang Kwon, In Hyuk Bang, So Young Choi, Ju Mi Jeon, Ann-Yae Na, Yan Gao, Sam Seok Cho, Sung Hwan Ki, Youngshik Choe, Jun Nyung Lee, Yun-Sok Ha, Eun Ju Bae, Tae Gyun Kwon, Byung-Hyun Park, Sangkyu Lee
Prostate cancer (PCa) is the most commonly diagnosed genital cancer in men worldwide. Around 80% of the patients who developed advanced PCa suffered from bone metastasis, with a sharp drop in the survival rate. Despite great efforts, the detailed mechanisms underlying castration-resistant PCa (CRPC) remain unclear. Sirtuin 5 (SIRT5), an NAD+-dependent desuccinylase, is hypothesized to be a key regulator of various cancers. However, compared to other SIRTs, the role of SIRT5 in cancer has not been extensively studied. Here, we revealed significantly decreased SIRT5 levels in aggressive PCa cells relative to the PCa stages. The correlation between the decrease in the SIRT5 level and the patient’s reduced survival rate was also confirmed. Using quantitative global succinylome analysis, we characterized a significant increase in the succinylation at lysine 118 (K118su) of lactate dehydrogenase A (LDHA), which plays a role in increasing LDH activity. As a substrate of SIRT5, LDHA-K118su significantly increased the migration and invasion of PCa cells and LDH activity in PCa patients. This study reveals the reduction of SIRT5 protein expression and LDHA-K118su as a novel mechanism involved in PCa progression, which could serve as a new target to prevent CRPC progression for PCa treatment.
Title: LDHA Desuccinylase Sirtuin 5 as A Novel Cancer Metastatic Stimulator in Aggressive Prostate Cancer
Journal: Genomics, Proteomics & Bioinformatics, Volume 21, Issue 1, Pages 177-189
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10372916/pdf/
Pub Date: 2023-02-01
DOI: 10.1016/j.gpb.2022.04.008
Zuguang Gu, Daniel Hübschmann
Functional enrichment analysis or gene set enrichment analysis is a basic bioinformatics method that evaluates the biological importance of a list of genes of interest. However, it may produce a long list of significant terms with highly redundant information that is difficult to summarize. Current tools to simplify enrichment results by clustering them into groups either still produce redundancy between clusters or do not retain consistent term similarities within clusters. We propose a new method named binary cut for clustering similarity matrices of functional terms. Through comprehensive benchmarks on both simulated and real-world datasets, we demonstrated that binary cut could efficiently cluster functional terms into groups where terms showed consistent similarities within groups and were mutually exclusive between groups. We compared binary cut clustering on the similarity matrices obtained from different similarity measures and found that semantic similarity worked well with binary cut, while similarity matrices based on gene overlap showed less consistent patterns. We implemented the binary cut algorithm in the R package simplifyEnrichment, which additionally provides functionalities for visualizing, summarizing, and comparing the clustering. The simplifyEnrichment package and the documentation are available at https://bioconductor.org/packages/simplifyEnrichment/.
simplifyEnrichment: A Bioconductor Package for Clustering and Visualizing Functional Enrichment Results. Genomics, Proteomics & Bioinformatics, Volume 21, Issue 1, Pages 190-202. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10373083/pdf/
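As a rough illustration of the clustering idea described in the abstract above, the following Python sketch builds a gene-overlap (Jaccard) similarity matrix and recursively splits the terms into two groups, keeping a split only when the two groups are well separated. It is not the simplifyEnrichment implementation: the seed-based two-way split, the score formula, the 0.85 cutoff, the function names, and the example gene sets are all assumptions made for illustration.

```python
from statistics import median

# Hypothetical toy gene sets, for illustration only.
EXAMPLE_SETS = {
    "T cell activation": {"CD3E", "CD4", "CD8A", "LCK", "ZAP70"},
    "lymphocyte activation": {"CD3E", "CD4", "CD8A", "LCK", "CD19"},
    "DNA repair": {"BRCA1", "BRCA2", "RAD51", "ATM"},
    "double-strand break repair": {"BRCA1", "BRCA2", "RAD51", "XRCC4"},
}


def jaccard(a, b):
    """Gene-overlap similarity between two gene sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(gene_sets):
    """Return sorted term names and their pairwise Jaccard similarities."""
    names = sorted(gene_sets)
    sim = [[jaccard(gene_sets[x], gene_sets[y]) for y in names] for x in names]
    return names, sim


def two_way_split(idx, sim):
    """Assumed split rule: seed with the least similar pair of terms,
    then assign every term to the seed it resembles more."""
    pairs = [(i, j) for i in idx for j in idx if i < j]
    seed_a, seed_b = min(pairs, key=lambda p: sim[p[0]][p[1]])
    left = [i for i in idx if sim[i][seed_a] >= sim[i][seed_b]]
    right = [i for i in idx if sim[i][seed_a] < sim[i][seed_b]]
    return left, right


def median_within(group, sim):
    """Median off-diagonal similarity inside a group (1.0 for a singleton)."""
    values = [sim[i][j] for i in group for j in group if i < j]
    return median(values) if values else 1.0


def split_score(left, right, sim):
    """Assumed score: share of within-group similarity in the total."""
    s11 = median_within(left, sim)
    s22 = median_within(right, sim)
    s12 = median(sim[i][j] for i in left for j in right)
    total = s11 + s22 + 2 * s12
    return (s11 + s22) / total if total else 0.5


def split_recursively(idx, sim, cutoff):
    """Keep bisecting while the two halves are separated well enough."""
    if len(idx) < 2:
        return [idx]
    left, right = two_way_split(idx, sim)
    if not right or split_score(left, right, sim) < cutoff:
        return [idx]
    return (split_recursively(left, sim, cutoff)
            + split_recursively(right, sim, cutoff))


def cluster_terms(gene_sets, cutoff=0.85):
    """Cluster term names; the default cutoff is an illustrative choice."""
    names, sim = similarity_matrix(gene_sets)
    groups = split_recursively(list(range(len(names))), sim, cutoff)
    return [[names[i] for i in group] for group in groups]
```

Calling `cluster_terms(EXAMPLE_SETS)` groups the two repair terms together and the two activation terms together. Lowering `cutoff` makes the procedure split more aggressively, so the cutoff acts as the knob that trades redundancy between clusters against consistency within them.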
Pub Date : 2023-02-01 DOI: 10.1016/j.gpb.2022.08.001
Wei-Zhen Zhou , Wenke Li , Huayan Shen , Ruby W. Wang , Wen Chen , Yujing Zhang , Qingyi Zeng , Hao Wang , Meng Yuan , Ziyi Zeng , Jinhui Cui , Chuan-Yun Li , Fred Y. Ye , Zhou Zhou
Congenital heart disease (CHD) is one of the most common causes of major birth defects, with a prevalence of 1%. Although an increasing number of studies have reported the etiology of CHD, the findings scattered throughout the literature are difficult to retrieve and utilize in research and clinical practice. We therefore developed CHDbase, an evidence-based knowledgebase of CHD-related genes and clinical manifestations manually curated from 1114 publications, linking 1124 susceptibility genes and 3591 variations to more than 300 CHD types and related syndromes. Metadata such as the information of each publication and the selected population and samples, the strategy of studies, and the major findings of studies were integrated with each item of the research record. We also integrated functional annotations through parsing approximately 50 databases/tools to facilitate the interpretation of these genes and variations in disease pathogenicity. We further prioritized the significance of these CHD-related genes with a gene interaction network approach and extracted a core CHD sub-network with 163 genes. The clear genetic landscape of CHD enables phenotype classification based on the shared genetic origin. Overall, CHDbase provides a comprehensive and freely available resource to study CHD susceptibilities, supporting a wide range of users in the scientific and medical communities. CHDbase is accessible at http://chddb.fwgenetics.org.
CHDbase: A Comprehensive Knowledgebase for Congenital Heart Disease-related Genes and Clinical Manifestations. Genomics, Proteomics & Bioinformatics, Volume 21, Issue 1, Pages 216-227. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10372913/pdf/
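The abstract above does not say which network measures were used to prioritize genes or to extract the core sub-network. Purely as an illustration of what a gene interaction network approach can look like, the Python sketch below ranks genes by their number of interaction partners and extracts a k-core (the largest sub-network in which every gene has at least k partners). The choice of degree and k-core, the function names, and the placeholder gene names and edges are all assumptions and are not taken from CHDbase.

```python
from collections import defaultdict

# Hypothetical placeholder interactions, not drawn from CHDbase.
TOY_EDGES = [
    ("GENE_A", "GENE_B"), ("GENE_A", "GENE_C"), ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"), ("GENE_B", "GENE_D"), ("GENE_C", "GENE_D"),
    ("GENE_D", "GENE_E"), ("GENE_E", "GENE_F"),
]


def build_graph(edges):
    """Build an undirected adjacency map, ignoring self-interactions."""
    graph = defaultdict(set)
    for a, b in edges:
        if a != b:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def rank_by_degree(graph):
    """Order genes by number of partners (ties broken alphabetically)."""
    return sorted(graph, key=lambda gene: (-len(graph[gene]), gene))


def k_core(graph, k):
    """Repeatedly prune genes with fewer than k remaining partners."""
    core = {gene: set(partners) for gene, partners in graph.items()}
    changed = True
    while changed:
        weak = [gene for gene, partners in core.items() if len(partners) < k]
        changed = bool(weak)
        for gene in weak:
            for partner in core.pop(gene):
                if partner in core:
                    core[partner].discard(gene)
    return set(core)
```

With the toy edges, `rank_by_degree(build_graph(TOY_EDGES))` places `GENE_D` first, and `k_core(build_graph(TOY_EDGES), 3)` keeps only the four mutually connected genes. Degree is the simplest possible prioritization score; other centrality measures could be substituted without changing the pruning step.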