
Genomics, Proteomics & Bioinformatics: Latest Articles

GenBase: A Nucleotide Sequence Database.
Pub Date : 2024-09-13 DOI: 10.1093/gpbjnl/qzae047
Congfan Bu, Xinchang Zheng, Xuetong Zhao, Tianyi Xu, Xue Bai, Yaokai Jia, Meili Chen, Lili Hao, Jingfa Xiao, Zhang Zhang, Wenming Zhao, Bixia Tang, Yiming Bao

The rapid advancement of sequencing technologies poses challenges in managing the large volume and exponential growth of sequence data efficiently and in a timely manner. To address this issue, we present GenBase (https://ngdc.cncb.ac.cn/genbase), an open-access data repository that follows the International Nucleotide Sequence Database Collaboration (INSDC) data standards and structures, for efficient nucleotide sequence archiving, searching, and sharing. As a core resource within the National Genomics Data Center (NGDC) of the China National Center for Bioinformation (CNCB; https://ngdc.cncb.ac.cn), GenBase offers a bilingual submission pipeline and services, as well as local submission assistance in China. GenBase also provides a unique Excel format for metadata description and feature annotation of nucleotide sequences, along with a real-time data validation system to streamline sequence submissions. As of April 23, 2024, GenBase had received 68,251 nucleotide sequences and 689,574 annotated protein sequences across 414 species from 2319 submissions. Of these, 63,614 (93%) nucleotide sequences and 620,640 (90%) annotated protein sequences have been released and are publicly accessible through GenBase's web search system, File Transfer Protocol (FTP), and Application Programming Interface (API). Additionally, in collaboration with INSDC, GenBase has established an effective data exchange mechanism with GenBank and started sharing released nucleotide sequences. Furthermore, GenBase integrates all sequences from GenBank with daily updates, demonstrating its commitment to actively contributing to global sequence data management and sharing.
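
The release percentages quoted in the abstract can be re-derived from the counts it reports. The short sketch below only repeats that arithmetic; the dictionary and function names are illustrative and are not part of any GenBase interface.

```python
# Counts as reported in the abstract (as of April 23, 2024).
received = {"nucleotide": 68_251, "protein": 689_574}
released = {"nucleotide": 63_614, "protein": 620_640}


def released_percent(kind: str) -> float:
    """Share of received sequences of the given kind that have been released."""
    return 100.0 * released[kind] / received[kind]


for kind in received:
    print(f"{kind}: {released_percent(kind):.1f}% released")
```

Both values round to the whole-number percentages given in the text.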

Citations: 0
Machine Learning for AI Breeding in Plants.
Pub Date : 2024-09-13 DOI: 10.1093/gpbjnl/qzae051
Qian Cheng, Xiangfeng Wang
Citations: 0
Variation and Interaction of Distinct Subgenomes Contribute to Growth Diversity in Intergeneric Hybrid Fish.
Pub Date : 2024-07-23 DOI: 10.1093/gpbjnl/qzae055
Li Ren, Mengxue Luo, Jialin Cui, Xin Gao, Hong Zhang, Ping Wu, Zehong Wei, Yakui Tai, Mengdan Li, Kaikun Luo, Shaojun Liu

Intergeneric hybridization greatly reshapes regulatory interactions among allelic and non-allelic genes. However, their effects on growth diversity remain poorly understood in animals. In this study, we conducted whole-genome sequencing and RNA sequencing (RNA-seq) analyses in diverse hybrid varieties resulting from the intergeneric hybridization of goldfish (Carassius auratus red var.) and common carp (Cyprinus carpio). These hybrid individuals were characterized by distinct mitochondrial genomes and copy number variations. Through a weighted gene correlation network analysis, we identified 3693 genes as candidate growth-regulated genes. Among them, the expression of 3672 genes in subgenome R (originating from goldfish) displayed negative correlations with growth rate, whereas 20 genes in subgenome C (originating from common carp) exhibited positive correlations. Notably, we observed intriguing patterns in the expression of slc2a12 in subgenome C, showing opposite correlations with body weight that changed with water temperatures, suggesting differential interactions between feeding activity and weight gain in response to seasonal changes for hybrid animals. In 40.31% of alleles, we observed dominant trans-regulatory effects in the regulatory interaction between distinct alleles from subgenomes R and C. Integrated analyses of allele-specific expression and DNA methylation data revealed that the influence of DNA methylation on both subgenomes shapes the relative contribution of allelic expression to the growth rate. These findings provide novel insights into the interaction of distinct subgenomes that underlies heterosis in growth traits and contribute to a better understanding of multiple-allele traits in animals.

Citations: 0
Opportunities and Challenges in Advancing Plant Research with Single-cell Omics.
Pub Date : 2024-07-03 DOI: 10.1093/gpbjnl/qzae026
Mohammad Saidur Rhaman, Muhammad Ali, Wenxiu Ye, Bosheng Li

Plants possess diverse cell types and intricate regulatory mechanisms to adapt to the ever-changing environment of nature. Various strategies have been employed to study cell types and their developmental progressions, including single-cell sequencing methods which provide high-dimensional catalogs to address biological concerns. In recent years, single-cell sequencing technologies in transcriptomics, epigenomics, proteomics, metabolomics, and spatial transcriptomics have been increasingly used in plant science to reveal intricate biological relationships at the single-cell level. However, the application of single-cell technologies to plants is more limited due to the challenges posed by cell structure. This review outlines the advancements in single-cell omics technologies, their implications in plant systems, future research applications, and the challenges of single-cell omics in plant systems.

Citations: 0
Genome-wide Studies Reveal Genetic Risk Factors for Hepatic Fat Content.
Pub Date : 2024-07-03 DOI: 10.1093/gpbjnl/qzae031
Yanni Li, Eline H van den Berg, Alexander Kurilshikov, Dasha V Zhernakova, Ranko Gacesa, Shixian Hu, Esteban A Lopera-Maya, Alexandra Zhernakova, Vincent E de Meijer, Serena Sanna, Robin P F Dullaart, Hans Blokzijl, Eleonora A M Festen, Jingyuan Fu, Rinse K Weersma

Genetic susceptibility to metabolic-associated fatty liver disease (MAFLD) is complex and poorly characterized. Accurate characterization of the genetic background of hepatic fat content would provide insights into disease etiology and causality of risk factors. We performed genome-wide association studies (GWASs) on two noninvasive definitions of hepatic fat content: magnetic resonance imaging proton density fat fraction (MRI-PDFF) in 16,050 participants and fatty liver index (FLI) in 388,701 participants from the United Kingdom (UK) Biobank (UKBB). Heritability, genetic overlap, and similarity between hepatic fat content phenotypes were analyzed and replicated in 10,398 participants from the University Medical Center Groningen (UMCG) Genetics Lifelines Initiative (UGLI). Meta-analysis of GWASs of MRI-PDFF in UKBB revealed five statistically significant loci, including two novel genomic loci harboring CREB3L1 (rs72910057-T, P = 5.40E-09) and GCM1 (rs1491489378-T, P = 3.16E-09), respectively, as well as three previously reported loci: PNPLA3, TM6SF2, and APOE. GWAS of FLI in UKBB identified 196 genome-wide significant loci, of which 49 were replicated in UGLI, with top signals in ZPR1 (P = 3.35E-13) and FTO (P = 2.11E-09). A statistically significant genetic correlation (rg) between MRI-PDFF (UKBB) and FLI (UGLI) GWAS results was found (rg = 0.5276, P = 1.45E-03). Novel MRI-PDFF genetic signals (CREB3L1 and GCM1) were replicated in the FLI GWAS. We identified two novel genes for MRI-PDFF and 49 replicable loci for FLI. Despite a difference in hepatic fat content assessment between MRI-PDFF and FLI, a substantially similar genetic architecture was found. FLI is identified as an easy and reliable approach to study hepatic fat content at the population level.

Citations: 0
Correction to: m6A Profile Dynamics Indicates Regulation of Oyster Development by m6A-RNA Epitranscriptomes.
Pub Date : 2024-07-03 DOI: 10.1093/gpbjnl/qzae021
Citations: 0
Correction to: Single-cell RNA Sequencing Reveals Sexually Dimorphic Transcriptome and Type 2 Diabetes Genes in Mouse Islet β Cells.
Pub Date : 2024-07-03 DOI: 10.1093/gpbjnl/qzae022
Citations: 0
BSAlign: A Library for Nucleotide Sequence Alignment.
Pub Date : 2024-07-03 DOI: 10.1093/gpbjnl/qzae025
Haojing Shao, Jue Ruan

Increasing the accuracy of nucleotide sequence alignment is an essential issue in genomics research. Although classic dynamic programming (DP) algorithms (e.g., Smith-Waterman and Needleman-Wunsch) are guaranteed to produce the optimal result, their time complexity hinders their application to large-scale sequence alignment. Optimization efforts that aim to accelerate the alignment process generally come from three perspectives: redesigning data structures [e.g., diagonal or striped Single Instruction Multiple Data (SIMD) implementations], increasing the degree of parallelism in SIMD operations (e.g., difference recurrence relation), or reducing the search space (e.g., banded DP). However, no method combines all three aspects to build an ultra-fast algorithm. In this study, we developed a Banded Striped Aligner (BSAlign) library that delivers accurate alignment results at ultra-fast speed by knitting together a series of novel methods that take advantage of all three perspectives, with highlights such as an active F-loop in striped vectorization and a striped move in banded DP. We applied our new acceleration design to both regular and edit distance pairwise alignment. BSAlign achieved a 2-fold speed-up over other SIMD-based implementations for regular pairwise alignment, and a 1.5-fold to 4-fold speed-up over edit distance-based implementations for long reads. BSAlign is implemented in the C programming language and is available at https://github.com/ruanjue/bsalign.
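
Of the three perspectives listed in the abstract, the search-space reduction of banded DP is the easiest to illustrate in isolation. The following is a minimal Python sketch of a banded edit distance, written for illustration only: it is not BSAlign's implementation (which is in C and additionally uses striped SIMD vectorization), and the function name and the `band` parameter are hypothetical.

```python
from typing import Optional


def banded_edit_distance(a: str, b: str, band: int) -> Optional[int]:
    """Edit distance computed only over DP cells with |i - j| <= band.

    Returns None when the lengths differ by more than the band, because the
    final cell then lies outside the band. The result equals the true edit
    distance whenever that distance is at most `band`; otherwise it is an
    upper bound on it.
    """
    n, m = len(a), len(b)
    if abs(n - m) > band:
        return None
    inf = n + m + 1  # larger than any achievable edit distance
    # Row 0: turning the empty prefix of a into b[:j] costs j insertions.
    prev = [j if j <= band else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        lo = max(0, i - band)
        hi = min(m, i + band)
        if lo == 0:
            cur[0] = i  # deleting the first i characters of a
        for j in range(max(1, lo), hi + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(
                prev[j - 1] + cost,  # match or substitution
                prev[j] + 1,         # deletion from a
                cur[j - 1] + 1,      # insertion into a
            )
        prev = cur
    return prev[m]


print(banded_edit_distance("kitten", "sitting", 3))
```

Each row touches at most 2 × band + 1 cells that can hold a finite value, which is where the saving over the full n × m table comes from. This sketch still allocates full-length rows for clarity; a production implementation would store only the band.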

Citations: 0
DiffGR: Detecting Differentially Interacting Genomic Regions from Hi-C Contact Maps.
Pub Date : 2024-07-03 DOI: 10.1093/gpbjnl/qzae028
Huiling Liu, Wenxiu Ma

Recent advances in high-throughput chromosome conformation capture (Hi-C) techniques have allowed us to map genome-wide chromatin interactions and uncover higher-order chromatin structures, thereby shedding light on the principles of genome architecture and functions. However, statistical methods for detecting changes in large-scale chromatin organization such as topologically associating domains (TADs) are still lacking. Here, we propose a new statistical method, DiffGR, for detecting differentially interacting genomic regions at the TAD level between Hi-C contact maps. We utilized the stratum-adjusted correlation coefficient to measure the similarity of local TAD regions. We then developed a nonparametric approach to identify statistically significant changes in genomic interacting regions. Through simulation studies, we demonstrated that DiffGR can robustly and effectively discover differential genomic regions under various conditions. Furthermore, we successfully revealed cell type-specific changes in genomic interacting regions in both human and mouse Hi-C datasets, and illustrated that DiffGR yielded consistent and advantageous results compared with state-of-the-art differential TAD detection methods. The DiffGR R package is published under the GNU General Public License (GPL), version 2 or later, and is publicly available at https://github.com/wmalab/DiffGR.
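
To make the similarity measure concrete, here is a simplified Python sketch of a stratum-adjusted correlation between two contact sub-matrices covering the same region. It groups contacts by genomic distance (diagonal offset), computes a Pearson correlation per stratum, and combines the strata with weights proportional to stratum size times the product of the standard deviations. This is an assumption-laden illustration, not the DiffGR package's code: the function name and `max_offset` parameter are hypothetical, and the published estimator may differ in details such as smoothing, normalization, and variance stabilization.

```python
from math import sqrt
from typing import List, Optional

Matrix = List[List[float]]


def stratum_adjusted_correlation(
    x: Matrix, y: Matrix, max_offset: int
) -> Optional[float]:
    """Weighted average of per-stratum Pearson correlations.

    x and y are square contact matrices of equal size. Stratum k holds the
    entries at diagonal offset k. Strata with fewer than two entries or with
    zero variance in either matrix are skipped. Returns None when no stratum
    is usable.
    """
    n = len(x)
    numerator = 0.0    # sum over strata of N_k * cov_k
    denominator = 0.0  # sum over strata of N_k * sd_x,k * sd_y,k
    for k in range(1, max_offset + 1):
        xs = [x[i][i + k] for i in range(n - k)]
        ys = [y[i][i + k] for i in range(n - k)]
        n_k = len(xs)
        if n_k < 2:
            continue
        mean_x = sum(xs) / n_k
        mean_y = sum(ys) / n_k
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / n_k
        var_x = sum((a - mean_x) ** 2 for a in xs) / n_k
        var_y = sum((b - mean_y) ** 2 for b in ys) / n_k
        if var_x == 0.0 or var_y == 0.0:
            continue
        numerator += n_k * cov
        denominator += n_k * sqrt(var_x * var_y)
    return numerator / denominator if denominator > 0.0 else None


region_a = [[0, 5, 2, 1], [5, 0, 4, 3], [2, 4, 0, 6], [1, 3, 6, 0]]
region_b = [[0, 6, 2, 1], [6, 0, 3, 4], [2, 3, 0, 7], [1, 4, 7, 0]]
print(stratum_adjusted_correlation(region_a, region_b, max_offset=3))
```

Stratifying by distance keeps the strong distance-dependent decay of Hi-C contact frequency from dominating the correlation. A nonparametric test, such as one based on permutations, could then be built on top of this statistic, although the abstract does not specify the exact procedure.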

Citations: 0
Proteomic Stratification of Prognosis and Treatment Options for Small Cell Lung Cancer.
Pub Date : 2024-07-03 DOI: 10.1093/gpbjnl/qzae033
Zitian Huo, Yaqi Duan, Dongdong Zhan, Xizhen Xu, Nairen Zheng, Jing Cai, Ruifang Sun, Jianping Wang, Fang Cheng, Zhan Gao, Caixia Xu, Wanlin Liu, Yuting Dong, Sailong Ma, Qian Zhang, Yiyun Zheng, Liping Lou, Dong Kuang, Qian Chu, Jun Qin, Guoping Wang, Yi Wang

Small cell lung cancer (SCLC) is a highly malignant and heterogeneous cancer with limited therapeutic options and prognosis prediction models. Here, we analyzed formalin-fixed, paraffin-embedded (FFPE) samples of surgical resections by proteomic profiling, and stratified SCLC into three proteomic subtypes (S-I, S-II, and S-III) with distinct clinical outcomes and chemotherapy responses. The proteomic subtyping was an independent prognostic factor and performed better than current tumor-node-metastasis or Veterans Administration Lung Study Group staging methods. The subtyping results could be further validated using FFPE biopsy samples from an independent cohort, extending the analysis to both surgical and biopsy samples. The signatures of the S-II subtype in particular suggested potential benefits from immunotherapy. Differentially overexpressed proteins in S-III, the worst prognostic subtype, allowed us to nominate potential therapeutic targets, indicating that patient selection may bring new hope for previously failed clinical trials. Finally, analysis of an independent cohort of SCLC patients who had received immunotherapy validated the prediction that the S-II patients had better progression-free survival and overall survival after first-line immunotherapy. Collectively, our study provides the rationale for future clinical investigations to validate the current findings for more accurate prognosis prediction and precise treatments.

Citations: 0