Genomics, Proteomics & Bioinformatics: Latest Articles

Nphos: Database and Predictor of Protein N-phosphorylation.
Pub Date : 2024-09-13 DOI: 10.1093/gpbjnl/qzae032
Ming-Xiao Zhao, Ruo-Fan Ding, Qiang Chen, Junhua Meng, Fulai Li, Songsen Fu, Biling Huang, Yan Liu, Zhi-Liang Ji, Yufen Zhao

Protein N-phosphorylation is widely present in nature and participates in various biological processes. However, current knowledge of N-phosphorylation is extremely limited compared with that of O-phosphorylation. In this study, we collected 11,710 experimentally verified N-phosphosites of 7344 proteins from 39 species and subsequently constructed the database Nphos to share up-to-date information on protein N-phosphorylation. Based on these substantial data, we characterized the sequential and structural features of protein N-phosphorylation. Moreover, after comparing hundreds of learning models, we chose and optimized gradient boosting decision tree (GBDT) models to predict three types of human N-phosphorylation, achieving mean area under the receiver operating characteristic curve (AUC) values of 90.56%, 91.24%, and 92.01% for pHis, pLys, and pArg, respectively. Meanwhile, we discovered 488,825 distinct N-phosphosites in the human proteome. The models were also deployed in Nphos for interactive N-phosphosite prediction. In summary, this work provides new insights and starting points for both flexible and focused investigations of N-phosphorylation. By providing a data and technical foundation, it will also facilitate a deeper and more systematic understanding of protein N-phosphorylation. Nphos is freely available at http://www.bio-add.org/Nphos/ and http://ppodd.org.cn/Nphos/.
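
The AUC values quoted above measure how reliably a model's scores rank true sites above non-sites. The following is a minimal sketch of that metric only, not the authors' GBDT pipeline: the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from Nphos.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC as the fraction of (positive, negative) pairs
    in which the positive example has the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = phosphosite, 0 = non-site.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

The pairwise form is quadratic in the number of examples, which is fine for illustration; a rank-based formulation would be the usual choice for large datasets.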

CBioProfiler: A Web and Standalone Pipeline for Cancer Biomarker and Subtype Characterization.
Pub Date : 2024-09-13 DOI: 10.1093/gpbjnl/qzae045
Xiaoping Liu, Zisong Wang, Hongjie Shi, Sheng Li, Xinghuan Wang

Cancer is a leading cause of death worldwide, and the identification of biomarkers and subtypes that can predict the long-term survival of cancer patients is essential for their risk stratification, treatment, and prognosis. However, there are currently no standardized tools for exploring cancer biomarkers or subtypes. In this study, we introduced Cancer Biomarker and subtype Profiler (CBioProfiler), a web server and standalone application that includes two pipelines for analyzing cancer biomarkers and subtypes. The cancer biomarker pipeline consists of five modules for identifying and annotating cancer survival-related biomarkers using multiple survival-related machine learning algorithms. The cancer subtype pipeline includes three modules for data preprocessing, subtype identification using multiple unsupervised machine learning methods, and subtype evaluation and validation. CBioProfiler also includes CuratedCancerPrognosisData, a novel R package that integrates reviewed and curated gene expression and clinical data from 268 studies. These studies cover 43 common blood and solid tumors and draw upon 47,686 clinical samples. The web server is available at https://www.cbioprofiler.com/ and https://cbioprofiler.znhospital.cn/CBioProfiler/, and the standalone app and source code can be found at https://github.com/liuxiaoping2020/CBioProfiler.

GenBase: A Nucleotide Sequence Database.
Pub Date : 2024-09-13 DOI: 10.1093/gpbjnl/qzae047
Congfan Bu, Xinchang Zheng, Xuetong Zhao, Tianyi Xu, Xue Bai, Yaokai Jia, Meili Chen, Lili Hao, Jingfa Xiao, Zhang Zhang, Wenming Zhao, Bixia Tang, Yiming Bao

The rapid advancement of sequencing technologies poses challenges in managing the large volume and exponential growth of sequence data efficiently and in a timely manner. To address this issue, we present GenBase (https://ngdc.cncb.ac.cn/genbase), an open-access data repository that follows the International Nucleotide Sequence Database Collaboration (INSDC) data standards and structures, for efficient nucleotide sequence archiving, searching, and sharing. As a core resource within the National Genomics Data Center (NGDC) of the China National Center for Bioinformation (CNCB; https://ngdc.cncb.ac.cn), GenBase offers a bilingual submission pipeline and services, as well as local submission assistance in China. GenBase also provides a unique Excel format for metadata description and feature annotation of nucleotide sequences, along with a real-time data validation system to streamline sequence submissions. As of April 23, 2024, GenBase had received 68,251 nucleotide sequences and 689,574 annotated protein sequences across 414 species from 2319 submissions. Of these, 63,614 (93%) nucleotide sequences and 620,640 (90%) annotated protein sequences have been released and are publicly accessible through GenBase's web search system, File Transfer Protocol (FTP), and Application Programming Interface (API). Additionally, in collaboration with INSDC, GenBase has constructed an effective data exchange mechanism with GenBank and started sharing released nucleotide sequences. Furthermore, GenBase integrates all sequences from GenBank with daily updates, demonstrating its commitment to actively contributing to global sequence data management and sharing.
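
The release percentages quoted above can be checked directly from the counts in the abstract. This is a small arithmetic sketch; the dictionary and function names are illustrative and are not part of any GenBase interface.

```python
# Counts as stated in the abstract (as of April 23, 2024).
received = {"nucleotide": 68_251, "protein": 689_574}
released = {"nucleotide": 63_614, "protein": 620_640}


def release_rate(kind):
    """Percentage of received sequences of the given kind that were released."""
    return 100.0 * released[kind] / received[kind]


for kind in received:
    print(f"{kind}: {release_rate(kind):.1f}% released")
```

Rounded to whole percentages, both values agree with the 93% and 90% reported in the abstract.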

Machine Learning for AI Breeding in Plants.
Pub Date : 2024-09-13 DOI: 10.1093/gpbjnl/qzae051
Qian Cheng, Xiangfeng Wang
Variation and Interaction of Distinct Subgenomes Contribute to Growth Diversity in Intergeneric Hybrid Fish.
Pub Date : 2024-07-23 DOI: 10.1093/gpbjnl/qzae055
Li Ren, Mengxue Luo, Jialin Cui, Xin Gao, Hong Zhang, Ping Wu, Zehong Wei, Yakui Tai, Mengdan Li, Kaikun Luo, Shaojun Liu

Intergeneric hybridization greatly reshapes regulatory interactions among allelic and non-allelic genes. However, their effects on growth diversity remain poorly understood in animals. In this study, we conducted whole-genome sequencing and RNA sequencing (RNA-seq) analyses in diverse hybrid varieties resulting from the intergeneric hybridization of goldfish (Carassius auratus red var.) and common carp (Cyprinus carpio). These hybrid individuals were characterized by distinct mitochondrial genomes and copy number variations. Through a weighted gene correlation network analysis, we identified 3693 genes as candidate growth-regulated genes. Among them, the expression of 3672 genes in subgenome R (originating from goldfish) displayed negative correlations with growth rate, whereas 20 genes in subgenome C (originating from common carp) exhibited positive correlations. Notably, we observed intriguing patterns in the expression of slc2a12 in subgenome C, showing opposite correlations with body weight that changed with water temperatures, suggesting differential interactions between feeding activity and weight gain in response to seasonal changes for hybrid animals. In 40.31% of alleles, we observed dominant trans-regulatory effects in the regulatory interaction between distinct alleles from subgenomes R and C. Integrating analyses of allele-specific expression and DNA methylation data revealed that the influence of DNA methylation on both subgenomes shapes the relative contribution of allelic expression to the growth rate. These findings provide novel insights into the interaction of distinct subgenomes that underlie heterosis in growth traits and contribute to a better understanding of multiple-allele traits in animals.

Opportunities and Challenges in Advancing Plant Research with Single-cell Omics.
Pub Date : 2024-07-03 DOI: 10.1093/gpbjnl/qzae026
Mohammad Saidur Rhaman, Muhammad Ali, Wenxiu Ye, Bosheng Li

Plants possess diverse cell types and intricate regulatory mechanisms to adapt to the ever-changing environment of nature. Various strategies have been employed to study cell types and their developmental progressions, including single-cell sequencing methods, which provide high-dimensional catalogs to address biological questions. In recent years, single-cell sequencing technologies in transcriptomics, epigenomics, proteomics, metabolomics, and spatial transcriptomics have been increasingly used in plant science to reveal intricate biological relationships at the single-cell level. However, the application of single-cell technologies to plants is more limited due to the challenges posed by cell structure. This review outlines the advancements in single-cell omics technologies, their implications in plant systems, future research applications, and the challenges of single-cell omics in plant systems.

Genome-wide Studies Reveal Genetic Risk Factors for Hepatic Fat Content.
Pub Date : 2024-07-03 DOI: 10.1093/gpbjnl/qzae031
Yanni Li, Eline H van den Berg, Alexander Kurilshikov, Dasha V Zhernakova, Ranko Gacesa, Shixian Hu, Esteban A Lopera-Maya, Alexandra Zhernakova, Vincent E de Meijer, Serena Sanna, Robin P F Dullaart, Hans Blokzijl, Eleonora A M Festen, Jingyuan Fu, Rinse K Weersma

Genetic susceptibility to metabolic-associated fatty liver disease (MAFLD) is complex and poorly characterized. Accurate characterization of the genetic background of hepatic fat content would provide insights into disease etiology and causality of risk factors. We performed genome-wide association studies (GWASs) on two noninvasive definitions of hepatic fat content: magnetic resonance imaging proton density fat fraction (MRI-PDFF) in 16,050 participants and fatty liver index (FLI) in 388,701 participants from the United Kingdom (UK) Biobank (UKBB). Heritability, genetic overlap, and similarity between hepatic fat content phenotypes were analyzed and replicated in 10,398 participants from the University Medical Center Groningen (UMCG) Genetics Lifelines Initiative (UGLI). Meta-analysis of GWASs of MRI-PDFF in UKBB revealed five statistically significant loci, including two novel genomic loci harboring CREB3L1 (rs72910057-T, P = 5.40E-09) and GCM1 (rs1491489378-T, P = 3.16E-09), respectively, as well as three previously reported loci: PNPLA3, TM6SF2, and APOE. GWAS of FLI in UKBB identified 196 genome-wide significant loci, of which 49 were replicated in UGLI, with top signals in ZPR1 (P = 3.35E-13) and FTO (P = 2.11E-09). A statistically significant genetic correlation (rg) between MRI-PDFF (UKBB) and FLI (UGLI) GWAS results was found (rg = 0.5276, P = 1.45E-03). Novel MRI-PDFF genetic signals (CREB3L1 and GCM1) were replicated in the FLI GWAS. We identified two novel genes for MRI-PDFF and 49 replicable loci for FLI. Despite a difference in hepatic fat content assessment between MRI-PDFF and FLI, a substantially similar genetic architecture was found. FLI is identified as an easy and reliable approach to study hepatic fat content at the population level.
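
To put the reported P values in context, the sketch below compares them against the conventional genome-wide significance threshold of 5E-08. The threshold is an assumption here, since the abstract does not state which cutoff the authors used, and the variable names are illustrative. The first two P values come from the MRI-PDFF meta-analysis and the last two from the FLI GWAS.

```python
# Conventional genome-wide significance threshold (assumed; the abstract
# does not state the cutoff the authors applied).
GENOME_WIDE_ALPHA = 5e-8

# Lead-signal P values exactly as quoted in the abstract.
reported = {
    "CREB3L1": 5.40e-09,  # MRI-PDFF meta-analysis (UKBB)
    "GCM1": 3.16e-09,     # MRI-PDFF meta-analysis (UKBB)
    "ZPR1": 3.35e-13,     # FLI GWAS
    "FTO": 2.11e-09,      # FLI GWAS
}

# Keep loci below the threshold, strongest signal first.
significant = sorted(
    (gene for gene, p in reported.items() if p < GENOME_WIDE_ALPHA),
    key=reported.get,
)
print(significant)
```

All four quoted signals fall below the assumed threshold, with ZPR1 the strongest.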

Correction to: m6A Profile Dynamics Indicates Regulation of Oyster Development by m6A-RNA Epitranscriptomes.
Pub Date : 2024-07-03 DOI: 10.1093/gpbjnl/qzae021
Correction to: Single-cell RNA Sequencing Reveals Sexually Dimorphic Transcriptome and Type 2 Diabetes Genes in Mouse Islet β Cells.
Pub Date : 2024-07-03 DOI: 10.1093/gpbjnl/qzae022
BSAlign: A Library for Nucleotide Sequence Alignment.
Pub Date : 2024-07-03 DOI: 10.1093/gpbjnl/qzae025
Haojing Shao, Jue Ruan

Increasing the accuracy of nucleotide sequence alignment is an essential issue in genomics research. Although classic dynamic programming (DP) algorithms (e.g., Smith-Waterman and Needleman-Wunsch) are guaranteed to produce the optimal result, their time complexity hinders their application to large-scale sequence alignment. Efforts to accelerate the alignment process generally come from three perspectives: redesigning data structures [e.g., diagonal or striped Single Instruction Multiple Data (SIMD) implementations], increasing the degree of parallelism in SIMD operations (e.g., difference recurrence relation), or reducing the search space (e.g., banded DP). However, no existing method combines all three aspects to build an ultra-fast algorithm. In this study, we developed the Banded Striped Aligner (BSAlign) library, which delivers accurate alignment results at ultra-fast speed by combining a series of novel methods that exploit all three perspectives, notably an active F-loop in striped vectorization and a striped move in banded DP. We applied our new acceleration design to both regular and edit-distance pairwise alignment. BSAlign achieved a 2-fold speed-up over other SIMD-based implementations for regular pairwise alignment, and a 1.5-fold to 4-fold speed-up in edit distance-based implementations for long reads. BSAlign is implemented in the C programming language and is available at https://github.com/ruanjue/bsalign.
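
Of the three perspectives listed above, reducing the search space with banded DP is the easiest to illustrate in a few lines. The sketch below is a plain, scalar banded edit distance written for clarity. It is not BSAlign's implementation, it uses no SIMD or striping, and the function name and `band` parameter are hypothetical.

```python
def banded_edit_distance(a, b, band):
    """Edit distance between a and b, filling only DP cells with |i - j| <= band.

    Returns the exact distance when it is at most `band`, otherwise None.
    """
    n, m = len(a), len(b)
    if abs(n - m) > band:
        return None  # the final cell (n, m) lies outside the band
    inf = float("inf")
    # Row 0: turning an empty prefix of a into b[:j] costs j insertions.
    prev = [j if j <= band else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)  # cells outside the band stay unreachable
        lo = max(0, i - band)
        hi = min(m, i + band)
        if lo == 0:
            cur[0] = i  # i deletions to reach an empty prefix of b
        for j in range(max(1, lo), hi + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(
                prev[j - 1] + cost,  # match or substitution
                prev[j] + 1,         # deletion from a
                cur[j - 1] + 1,      # insertion into a
            )
        prev = cur
    return prev[m] if prev[m] <= band else None


print(banded_edit_distance("kitten", "sitting", band=3))
```

Restricting the fill to a diagonal band reduces the work from O(nm) to roughly O(n x band). The result is exact whenever it does not exceed `band`, because an alignment with d edits never strays more than d cells from the main diagonal.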

Citations: 0