Genomics, proteomics & bioinformatics: latest articles

SoyOD: An Integrated Soybean Multi-omics Database for Mining Genes and Biological Research.
Pub Date : 2025-01-15 DOI: 10.1093/gpbjnl/qzae080
Jie Li, Qingyang Ni, Guangqi He, Jiale Huang, Haoyu Chao, Sida Li, Ming Chen, Guoyu Hu, James Whelan, Huixia Shou

Soybean is a globally important crop for food, feed, oil, and nitrogen fixation. A variety of multi-omics studies have been carried out, generating datasets ranging from genotype to phenotype. In order to efficiently utilize these data for basic and applied research, a soybean multi-omics database with extensive data coverage and comprehensive data analysis tools was established. The Soybean Omics Database (SoyOD) integrates important new datasets with existing public datasets to form the most comprehensive collection of soybean multi-omics information. Compared to existing soybean databases, SoyOD incorporates an extensive collection of novel data derived from the deep-sequencing of 984 germplasms, 162 novel transcriptomic datasets from seeds at different developmental stages, 53 phenotypic datasets, and more than 2500 phenotypic images. In addition, SoyOD integrates existing data resources, including 59 assembled genomes, genetic variation data from 3904 soybean accessions, 225 sets of phenotypic data, and 1097 transcriptomic sequences covering 507 different tissues and treatment conditions. Moreover, SoyOD can be used to mine candidate genes for important agronomic traits, as shown in a case study on plant height. Additionally, powerful analytical and easy-to-use toolkits enable users to easily access the available multi-omics datasets, and to rapidly search genotypic and phenotypic data in a particular germplasm. The novelty, comprehensiveness, and user-friendly features of SoyOD make it a valuable resource for soybean molecular breeding and biological research. SoyOD is publicly accessible at https://bis.zju.edu.cn/soyod.

Citations: 0
Variation and Interaction of Distinct Subgenomes Contribute to Growth Diversity in Intergeneric Hybrid Fish.
Pub Date : 2025-01-15 DOI: 10.1093/gpbjnl/qzae055
Li Ren, Mengxue Luo, Jialin Cui, Xin Gao, Hong Zhang, Ping Wu, Zehong Wei, Yakui Tai, Mengdan Li, Kaikun Luo, Shaojun Liu

Intergeneric hybridization greatly reshapes regulatory interactions among allelic and non-allelic genes. However, their effects on growth diversity remain poorly understood in animals. In this study, we conducted whole-genome sequencing and RNA sequencing analyses in diverse hybrid varieties resulting from the intergeneric hybridization of goldfish (Carassius auratus red var.) and common carp (Cyprinus carpio). These hybrid individuals were characterized by distinct mitochondrial genomes and copy number variations. Through a weighted gene correlation network analysis, we identified 3693 genes as candidate growth-regulating genes. Among them, the expression of 3672 genes in subgenome R (originating from goldfish) displayed negative correlations with body weight, whereas 20 genes in subgenome C (originating from common carp) exhibited positive correlations. Notably, we observed intriguing expression patterns of solute carrier family 2 member 12 (slc2a12) in subgenome C, showing opposite correlations with body weight that changed with water temperatures, suggesting differential interactions between feeding activity and weight gain in response to seasonal changes for hybrid animals. In 40.30% of alleles, we observed dominant trans-regulatory effects in the regulatory interactions between distinct alleles from subgenomes R and C. Integrating analyses of allele-specific expression and DNA methylation data revealed that DNA methylation on both subgenomes shaped the relative contribution of allelic expression to the growth rate. These findings provide novel insights into the interactions of distinct subgenomes that underlie heterosis in growth traits and contribute to a better understanding of multiple allelic traits in animals.
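
As a rough illustration of the gene-trait correlation reported above, the sketch below computes a Pearson correlation between each gene's expression and body weight and records its sign. It is not the authors' pipeline and is far simpler than a weighted gene correlation network analysis, which first groups genes into co-expression modules. The gene names and all values are hypothetical toy data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


# Hypothetical toy data: body weight (g) of five fish and expression of two genes.
body_weight = [120.0, 150.0, 180.0, 210.0, 240.0]
expression = {
    "gene_R_example": [9.0, 8.0, 7.0, 6.0, 5.0],  # falls as weight rises
    "gene_C_example": [2.0, 3.0, 4.0, 5.0, 6.0],  # rises with weight
}

# Classify each gene by the sign of its correlation with body weight.
signs = {
    gene: "positive" if pearson(values, body_weight) > 0 else "negative"
    for gene, values in expression.items()
}
print(signs)
```

A real analysis would also need multiple-testing correction and many more samples; the sign-only summary here is just the simplest way to mirror the negative and positive groupings the abstract describes.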

Citations: 0
A Novel Targeted Long-read Sequencing Approach Boosts Transcriptomic Profiling.
Pub Date : 2025-01-15 DOI: 10.1093/gpbjnl/qzae090
Xiaolong Tian, Rong Fan
Citations: 0
Decoding Spatial Complexity of Diverse RNA Species in Archival Tissues.
Pub Date : 2025-01-15 DOI: 10.1093/gpbjnl/qzae089
Junjie Zhu, Fangqing Zhao
Citations: 0
Harnessing Type II Cytokines to Reinvigorate Exhausted T Cells for Durable Cancer Immunotherapy.
Pub Date : 2025-01-15 DOI: 10.1093/gpbjnl/qzae093
Wenle Zhang, Yanwen Wang, Bin Li
Citations: 0
ProtPipe: A Multifunctional Data Analysis Pipeline for Proteomics and Peptidomics.
Pub Date : 2025-01-15 DOI: 10.1093/gpbjnl/qzae083
Ziyi Li, Cory A Weller, Syed Shah, Nicholas L Johnson, Ying Hao, Paige B Jarreau, Jessica Roberts, Deyaan Guha, Colleen Bereda, Sydney Klaisner, Pedro Machado, Matteo Zanovello, Mercedes Prudencio, Björn Oskarsson, Nathan P Staff, Dennis W Dickson, Pietro Fratta, Leonard Petrucelli, Priyanka Narayan, Mark R Cookson, Michael E Ward, Andrew B Singleton, Mike A Nalls, Yue A Qi

Mass spectrometry (MS) is a technique widely employed for the identification and characterization of proteins, with personalized medicine, systems biology, and biomedical applications. The application of MS-based proteomics advances our understanding of protein function, cellular signaling, and complex biological systems. MS data analysis is a critical process that includes identifying and quantifying proteins and peptides and then exploring their biological functions in downstream analyses. To address the complexities associated with MS data analysis, we developed ProtPipe to streamline and automate the processing and analysis of high-throughput proteomics and peptidomics datasets with DIA-NN preinstalled. The pipeline facilitates data quality control, sample filtering, and normalization, ensuring robust and reliable downstream analyses. ProtPipe provides downstream analyses, including protein and peptide differential abundance identification, pathway enrichment analysis, protein-protein interaction analysis, and major histocompatibility complex (MHC)-peptide binding affinity analysis. ProtPipe generates annotated tables and visualizations by performing statistical post-processing and calculating fold changes between predefined pairwise conditions in an experimental design. It is an open-source, well-documented tool available at https://github.com/NIH-CARD/ProtPipe, with a user-friendly web interface.
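
The fold-change step mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not ProtPipe's implementation: the function name, the pseudocount, the use of the replicate mean, and the intensity values are all hypothetical.

```python
from math import log2
from statistics import mean


def log2_fold_change(case, control, pseudocount=1.0):
    """log2 ratio of mean abundances.

    The pseudocount is an assumption of this sketch; it avoids division by zero
    when a protein is not detected in one condition.
    """
    return log2((mean(case) + pseudocount) / (mean(control) + pseudocount))


# Hypothetical intensities for one protein, three replicates per condition.
intensities = {
    "control": [97.0, 100.0, 103.0],
    "treated": [400.0, 403.0, 406.0],
}

# Predefined pairwise comparisons, written as (case, control) condition names.
comparisons = [("treated", "control")]

results = {
    f"{case}_vs_{control}": log2_fold_change(intensities[case], intensities[control])
    for case, control in comparisons
}
print(results)
```

Listing the comparisons explicitly, rather than computing every possible pair, mirrors the "predefined pairwise conditions" wording in the abstract and keeps the number of statistical tests under the analyst's control.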

Citations: 0
Evaluation of T Cell Receptor Construction Methods from scRNA-Seq Data.
Pub Date : 2025-01-15 DOI: 10.1093/gpbjnl/qzae086
Ruonan Tian, Zhejian Yu, Ziwei Xue, Jiaxin Wu, Lize Wu, Shuo Cai, Bing Gao, Bing He, Yu Zhao, Jianhua Yao, Linrong Lu, Wanlu Liu

T cell receptors (TCRs) serve key roles in the adaptive immune system by enabling recognition and response to pathogens and irregular cells. Various methods have been developed for TCR construction from single-cell RNA sequencing (scRNA-seq) datasets, each with its unique characteristics. Yet, a comprehensive evaluation of their relative performance under different conditions remains elusive. In this study, we conducted a benchmark analysis utilizing experimental single-cell immune profiling datasets. Additionally, we introduced a novel simulator, YASIM-scTCR (Yet Another SIMulator for single-cell TCR), capable of generating scTCR-seq reads containing diverse TCR-derived sequences with different sequencing depths and read lengths. Our results consistently showed that TRUST4 and MiXCR outperformed others across multiple datasets, while DeRR demonstrated considerable accuracy. We also discovered that the sequencing depth inherently imposes a critical constraint on successful TCR construction from scRNA-seq data. In summary, we present a benchmark study to aid researchers in choosing the appropriate method for reconstructing TCRs from scRNA-seq data.

Citations: 0
HemaCisDB: An Interactive Database for Analyzing Cis-Regulatory Elements Across Hematopoietic Malignancies.
Pub Date : 2024-12-26 DOI: 10.1093/gpbjnl/qzae088
Xinping Cai, Qianru Zhang, Bolin Liu, Lu Sun, Yuxuan Liu

Noncoding cis-regulatory elements (CREs), such as transcriptional enhancers, are key regulators of gene expression programs. Accessible chromatin and H3K27ac are well-recognized markers for CREs associated with their biological function. Deregulation of CREs is commonly found in hematopoietic malignancies yet the extent to which CRE dysfunction contributes to pathophysiology remains incompletely understood. Here, we developed HemaCisDB, an interactive, comprehensive, and centralized online resource for CRE characterization across hematopoietic malignancies, serving as a useful resource for investigating the pathological roles of CREs in blood disorders. Currently, we collected 922 ATAC-seq, 190 DNase-seq, and 531 H3K27ac ChIP-seq datasets from patient samples and cell lines across different myeloid and lymphoid neoplasms. HemaCisDB provides comprehensive quality control metrics to assess ATAC-seq, DNase-seq, and H3K27ac ChIP-seq data quality. The analytic modules in HemaCisDB include transcription factor (TF) footprinting inference, super-enhancer identification, and core transcriptional regulatory circuitry analysis. Moreover, HemaCisDB also enables the study of TF binding dynamics by comparing TF footprints across different disease types or conditions via web-based interactive analysis. Together, HemaCisDB provides an interactive platform for CRE characterization to facilitate mechanistic studies of transcriptional regulation in hematopoietic malignancies. HemaCisDB is available at https://hemacisdb.chinablood.com.cn/.

Citations: 0
COCOA: A Framework for Fine-scale Mapping Cell-type-specific Chromatin Compartments with Epigenomic Information.
Pub Date : 2024-12-26 DOI: 10.1093/gpbjnl/qzae091
Kai Li, Ping Zhang, Jinsheng Xu, Zi Wen, Junying Zhang, Zhike Zi, Li Li

Chromatin compartmentalization and epigenomic modification are crucial in cell differentiation and diseases development. However, precise mapping of chromatin compartmental patterns requires Hi-C or Micro-C data at high sequencing depth. Exploring the systematic relationship between epigenomic modifications and compartmental patterns remains challenging. To address these issues, we present COCOA, a deep neural network framework using convolution and attention mechanisms to infer fine-scale chromatin compartment patterns from six histone modification signals. COCOA extracts 1-D track features through bi-directional feature reconstruction after resolution-specific binning epigenomic signals. These track features are then cross-fused with contact features using an attention mechanism and transformed into chromatin compartment patterns through residual feature reduction. COCOA demonstrates accurate inference of chromatin compartmentalization at a fine-scale resolution and exhibits stable performance on test sets. Additionally, we explored the impact of histone modifications on chromatin compartmentalization prediction through in silico epigenomic perturbation experiments. Unlike obscure compartments observed with 1 kb resolution high-depth experimental data, COCOA generates clear and detailed compartmental patterns, highlighting its superior performance. Finally, we demonstrated that COCOA enables cell-type-specific prediction of unrevealed chromatin compartment patterns in various biological processes, making it an effective tool for gaining chromatin compartmentalization insights from epigenomics in diverse biological scenarios. The COCOA python code is publicly available at https://github.com/onlybugs/COCOA.
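
To make the idea of resolution-specific binning concrete, here is a minimal sketch that averages a per-position signal into fixed-width bins. It is an assumption-laden illustration, not code from the COCOA repository: the function name, the choice of the mean as the aggregate, and the toy values are hypothetical.

```python
def bin_signal(signal, bin_size):
    """Average a per-position signal into consecutive bins of `bin_size` positions.

    The final bin may cover fewer positions if the signal length is not a
    multiple of `bin_size`; it is averaged over the positions it does contain.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be a positive integer")
    chunks = (signal[i:i + bin_size] for i in range(0, len(signal), bin_size))
    return [sum(chunk) / len(chunk) for chunk in chunks]


# Hypothetical histone-mark coverage over five positions, binned at width 2.
coverage = [1, 1, 3, 3, 5]
binned = bin_signal(coverage, 2)
print(binned)
```

Changing `bin_size` is what makes the binning resolution-specific: a smaller width keeps finer detail at the cost of noisier values, while a larger width smooths the track.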

Citations: 0
SCREEN: A Graph-based Contrastive Learning Tool to Infer Catalytic Residues and Assess Enzyme Mutations.
Pub Date : 2024-12-26 DOI: 10.1093/gpbjnl/qzae094
Tong Pan, Yue Bi, Xiaoyu Wang, Ying Zhang, Geoffrey I Webb, Robin B Gasser, Lukasz Kurgan, Jiangning Song

The accurate identification of catalytic residues contributes to our understanding of enzyme functions in biological processes and pathways. The increasing number of protein sequences necessitates computational tools for the automated prediction of catalytic residues in enzymes. Here, we introduce SCREEN, a graph neural network for the high-throughput prediction of catalytic residues via the integration of enzyme functional and structural information. SCREEN constructs residue representations based on spatial arrangements and incorporates enzyme function priors into such representations through contrastive learning. We demonstrate that SCREEN (i) consistently outperforms currently available predictors; (ii) provides accurate results when applied to inferred enzyme structures; and (iii) generalizes well to enzymes dissimilar from those in the training set. We also show that the putative catalytic residues predicted by SCREEN mimic key structural and biophysical characteristics of native catalytic residues. Moreover, using experimental data sets, we show that SCREEN's predictions can be used to distinguish residues with a high mutation tolerance from those likely to cause functional loss when mutated, indicating that this tool might be used to infer disease-associated mutations. SCREEN is publicly available at https://github.com/BioColLab/SCREEN and https://ngdc.cncb.ac.cn/biocode/tool/7580.
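
The abstract states that enzyme function priors are incorporated through contrastive learning but gives no loss formula. As a rough sketch of the general idea, assuming an InfoNCE-style objective with cosine similarity (the function names, the temperature value, and the choice of loss are assumptions for illustration, not SCREEN's actual implementation), an embedding is pulled toward a positive example from the same function class and pushed away from negatives:

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor: Sequence[float],
             positive: Sequence[float],
             negatives: List[Sequence[float]],
             temperature: float = 0.1) -> float:
    """InfoNCE-style loss: small when the anchor is closer to the positive than to the negatives.

    Hypothetical sketch; assumes all vectors are non-zero.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# An anchor aligned with its positive yields a much lower loss than a misaligned one.
low = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
high = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In this toy setting, the positive would be an embedding sharing the anchor's enzyme function label and the negatives would carry different labels; how SCREEN actually selects pairs is not described in the abstract.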

Citations: 0