Briefings in Functional Genomics: latest articles (page 6)

Long-read RNA sequencing can probe organelle genome pervasive transcription.

IF 2.5 · Zone 3 (Biology) · Q3 Biotechnology & Applied Microbiology

Briefings in Functional Genomics

Pub Date : 2024-12-06 DOI: 10.1093/bfgp/elae026

Matheus Sanita Lima, Douglas Silva Domingues, Alexandre Rossi Paschoal, David Roy Smith

Forty years ago, organelle genomes were assumed to be streamlined and, perhaps, unexciting remnants of their prokaryotic past. However, the field of organelle genomics has exposed an unparalleled diversity in genome architecture (i.e. genome size, structure, and content). The transcription of these eccentric genomes can be just as elaborate: organelle genomes are pervasively transcribed into a plethora of RNA types. However, while organelle protein-coding genes are known to produce polycistronic transcripts that undergo heavy posttranscriptional processing, the nature of organelle noncoding transcriptomes is still poorly resolved. Here, we review how wet-lab experiments and second-generation sequencing data (i.e. short reads) have been useful to determine certain types of organelle RNAs, particularly noncoding RNAs. We then explain how third-generation (long-read) RNA-Seq data represent the new frontier in organelle transcriptomics. We show that public repositories (e.g. NCBI SRA) already contain enough data for inter-phyla comparative studies and argue that organelle biologists can benefit from such data. We discuss the prospects of using publicly available sequencing data for organelle-focused studies and examine the challenges of such an approach. We highlight that the lack of a comprehensive database dedicated to organelle genomics/transcriptomics is a major impediment to the development of a field with implications in basic and applied science.

Pages: 695-701

Citations: 0

Prioritization of candidate genes for major QTLs governing yield traits employing integrated multi-omics approach in rice (Oryza sativa L.).

Pub Date : 2024-12-06 DOI: 10.1093/bfgp/elae035

Issa Keerthi, Vishnu Shukla, Sudhamani Kalluru, Lal Ahamed Mohammad, P Lavanya Kumari, Eswarayya Ramireddy, Lakshminarayana R Vemireddy

Rapidly identifying candidate genes underlying major QTLs is crucial for improving rice (Oryza sativa L.). In this study, we developed a workflow to rapidly prioritize candidate genes underpinning 99 major QTLs governing yield component traits. This workflow integrates multiomics databases, including sequence variation, gene expression, gene ontology, co-expression analysis, and protein-protein interaction. We predicted 206 candidate genes for 99 reported QTLs governing ten economically important yield-contributing traits using this approach. Among these, transcription factors belonging to families of MADS-box, WRKY, helix-loop-helix, TCP, MYB, GRAS, auxin response factor, and nuclear transcription factor Y subunit were promising. Validation of key prioritized candidate genes in contrasting rice genotypes for sequence variation and differential expression identified Leucine-Rich Repeat family protein (LOC_Os03g28270) and cytochrome P450 (LOC_Os02g57290) as candidate genes for the major QTLs GL1 and pl2.1, which govern grain length and panicle length, respectively. In conclusion, this study demonstrates that our workflow can significantly narrow down a large number of annotated genes in a QTL to a very small number of the most probable candidates, achieving approximately a 21-fold reduction. These candidate genes have potential implications for enhancing rice yield.

Pages: 843-857

Citations: 0

Discoveries by the genome profiling, symbolic powers of non-next generation sequencing methods.

Pub Date : 2024-12-06 DOI: 10.1093/bfgp/elae047

Koichi Nishigaki

Next-generation sequencing and other sequencing approaches have made significant progress in DNA analysis. However, nonsequencing methods retain important advantages: they are fast, cost-effective, straightforward, and applicable to many problems. Among the nonsequencing methods, genome profiling merits review because of its high potential. This article first reviews its basic properties, highlights the key concept of species identification dots (spiddos), and then summarizes its various applications.

Citations: 0

Genetic variation mining of the Chinese mitten crab (Eriocheir sinensis) based on transcriptome data from public databases.

Pub Date : 2024-12-06 DOI: 10.1093/bfgp/elae030

Yuanfeng Xu, Fan Yu, Wenrong Feng, Jia Wei, Shengyan Su, Jianlin Li, Guoan Hua, Wenjing Li, Yongkai Tang

At present, public databases house an extensive repository of transcriptome data, with the volume continuing to grow at an accelerated pace. Utilizing these data effectively is a shared interest within the scientific community. In this study, we introduced a novel strategy that harnesses SNPs and InDels identified from transcriptome data, combined with sample metadata from databases, to effectively screen for molecular markers correlated with traits. We utilized 228 transcriptome datasets of Eriocheir sinensis from the NCBI database and employed the Genome Analysis Toolkit software to identify 96 388 SNPs and 20 645 InDels. Employing the genome-wide association study analysis, in conjunction with the gender information from databases, we identified 3456 sex-biased SNPs and 639 sex-biased InDels. The KOG and KEGG annotations of the sex-biased SNPs and InDels revealed that these genes were primarily involved in the metabolic processes of E. sinensis. Combined with SnpEff annotation and PCR experimental validation, a highly sex-biased SNP located in the Kelch domain containing 4 (Klhdc4) gene, CHR67-6415071, was found to alter the splicing sites of Klhdc4, generating two splice variants, Klhdc4_a and Klhdc4_b. Additionally, Klhdc4 exhibited robust expression across the ovaries, testes, and accessory glands. The sex-biased SNPs and InDels identified in this study are conducive to the development of unisexual cultivation methods for E. sinensis, and the alternative splicing event caused by the sex-biased SNP in Klhdc4 may serve as a potential mechanism for sex regulation in E. sinensis. The analysis strategy employed in this study represents a new direction for the rational exploitation and utilization of transcriptome data in public databases.
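
The idea of screening variants against sample metadata can be illustrated with a small association test. The sketch below is not the authors' pipeline (the abstract only names GATK and a genome-wide association analysis); it assumes a hypothetical 2x2 table of alternate and reference allele counts in female and male samples for one SNP, and applies a two-sided Fisher's exact test written with the standard library only. The counts are invented for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical allele counts for a single SNP (not data from the study):
    # rows are female and male samples, columns are alternate and reference alleles.
    female_alt, female_ref = 18, 2
    male_alt, male_ref = 3, 17
    p = fisher_exact_two_sided(female_alt, female_ref, male_alt, male_ref)
    print(f"p-value for sex bias at this SNP: {p:.3g}")
```

In a real screen the test would be repeated for every variant and the resulting p-values corrected for multiple testing before any SNP is called sex-biased.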

Pages: 816-827

Citations: 0

Gene regulatory network inference based on novel ensemble method.

Pub Date : 2024-12-06 DOI: 10.1093/bfgp/elae036

Bin Yang, Jing Li, Xiang Li, Sanrong Liu

Gene regulatory networks (GRNs) contribute toward understanding the function of genes and the development of cancer or the impact of key genes on diseases. Hence, this study proposes an ensemble method based on 13 basic classification methods and a flexible neural tree (FNT) to improve GRN identification accuracy. The primary classification methods comprise ridge classification, stochastic gradient descent, Gaussian process classification, Bernoulli Naive Bayes, adaptive boosting, gradient boosting decision tree, hist gradient boosting classification, eXtreme gradient boosting (XGBoost), multilayer perceptron, light gradient boosting machine, random forest, support vector machine, and k-nearest neighbor algorithm, which are regarded as the input variable set of the FNT model. Additionally, a hybrid evolutionary algorithm based on a gene programming variant and particle swarm optimization is developed to search for the optimal FNT model. Experiments on three simulation datasets and three real single-cell RNA-seq datasets demonstrate that the proposed ensemble feature outperforms 13 supervised algorithms, seven unsupervised algorithms (ARACNE, CLR, GENIE3, MRNET, PCACMI, GENECI, and EPCACMI) and four single cell-specific methods (SCODE, BiRGRN, LEAP, and BiGBoost) based on the area under the receiver operating characteristic curve, area under the precision-recall curve, and F1 metrics.
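
Two of the evaluation metrics named in the abstract, AUROC and F1, can be computed from predicted edge scores as sketched below. This is a generic, standard-library illustration rather than the authors' evaluation code; the labels (1 for a true regulatory edge, 0 otherwise), the scores, and the 0.5 decision threshold are all assumptions made for the example.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def f1_score(labels, scores, threshold=0.5):
    """F1 after turning scores into edge calls at the given threshold."""
    predicted = [int(s >= threshold) for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical candidate edges: 1 = true regulatory link, 0 = no link.
    edge_labels = [1, 1, 0, 0]
    edge_scores = [0.9, 0.4, 0.6, 0.1]
    print("AUROC:", auroc(edge_labels, edge_scores))
    print("F1:", f1_score(edge_labels, edge_scores))
```

AUROC is threshold-free, whereas F1 depends on where the scores are cut, which is one reason benchmark comparisons of GRN methods usually report both.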

Pages: 866-878

Citations: 0

An overview of key online resources for human genomics: a powerful and open toolbox for in silico research.

Pub Date : 2024-12-06 DOI: 10.1093/bfgp/elae029

Diego A Forero, Diego A Bonilla, Yeimy González-Giraldo, George P Patrinos

Recent advances in high-throughput molecular methods have led to an extraordinary volume of genomics data. Simultaneously, the progress in the computational implementation of novel algorithms has facilitated the creation of hundreds of freely available online tools for their advanced analyses. However, a general overview of the most commonly used tools for the in silico analysis of genomics data is still missing. In the current article, we present an overview of commonly used online resources for genomics research, including over 50 tools. This selection will be helpful for scientists with basic or intermediate skills in the in silico analyses of genomics data, such as researchers and students from wet labs seeking to strengthen their computational competencies. In addition, we discuss current needs and future perspectives within this field.

Citations: 0

A comprehensive survey of dimensionality reduction and clustering methods for single-cell and spatial transcriptomics data.

Pub Date : 2024-12-06 DOI: 10.1093/bfgp/elae023

Yidi Sun, Lingling Kong, Jiayi Huang, Hongyan Deng, Xinling Bian, Xingfeng Li, Feifei Cui, Lijun Dou, Chen Cao, Quan Zou, Zilong Zhang

In recent years, the application of single-cell transcriptomics and spatial transcriptomics analysis techniques has become increasingly widespread. Whether dealing with single-cell transcriptomic or spatial transcriptomic data, dimensionality reduction and clustering are indispensable. Both single-cell and spatial transcriptomic data are often high-dimensional, making the analysis and visualization of such data challenging. Through dimensionality reduction, it becomes possible to visualize the data in a lower-dimensional space, allowing for the observation of relationships and differences between cell subpopulations. Clustering enables the grouping of similar cells into the same cluster, aiding in the identification of distinct cell subpopulations and revealing cellular diversity, providing guidance for downstream analyses. In this review, we systematically summarized the most widely recognized algorithms employed for the dimensionality reduction and clustering analysis of single-cell transcriptomic and spatial transcriptomic data. This endeavor provides valuable insights and ideas that can contribute to the development of novel tools in this rapidly evolving field.
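
As a concrete instance of the clustering step described above, the following sketch implements plain k-means (Lloyd's algorithm) with the standard library. It is only one of many algorithms such a survey would cover and is not taken from the review. The six two-dimensional points stand in for cells that have already been embedded in a low-dimensional space, and `k`, `iterations`, and `seed` are illustrative parameters.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=50, seed=0):
    """Cluster points with Lloyd's algorithm; returns (assignments, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, new_assignments) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
        if new_assignments == assignments:
            break  # assignments stopped changing, so the algorithm has converged
        assignments = new_assignments
    return assignments, centroids


if __name__ == "__main__":
    # Hypothetical 2-D embedding of six cells forming two well-separated groups.
    cells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centers = kmeans(cells, k=2)
    print(labels)
    print(centers)
```

On real single-cell data the input would typically be the leading principal components rather than raw counts, and graph-based methods are a common alternative when clusters are not roughly spherical.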

Pages: 733-744

Citations: 0

Characterization of double-stranded RNA and its silencing efficiency for insects using hybrid deep-learning framework.

Pub Date : 2024-12-06 DOI: 10.1093/bfgp/elae027

Han Cheng, Liping Xu, Cangzhi Jia

RNA interference (RNAi) technology is widely used in the biological prevention and control of terrestrial insects. One of the main challenges in applying RNAi to insects is the variation in RNAi efficiency, which may differ not only between insects but also between genes of the same insect, and even between double-stranded RNAs (dsRNAs) targeting the same gene. This work focuses on the last issue and establishes a bioinformatics tool that can help researchers screen for the most efficient dsRNA against a target gene. Among insects, the red flour beetle (Tribolium castaneum) is known to be one of the most sensitive to RNAi. From iBeetle-Base, we extracted 12 027 efficient dsRNA sequences with a lethality rate of ≥20% or with experimentation-induced phenotypic changes and processed these data to correspond to specific silencing efficiencies. Based on this first compiled benchmark dataset, we designed a deep neural network to identify and characterize efficient dsRNA for RNAi in insects. The dna2vec word embedding model was trained to extract distributed feature representations, and three powerful modules, namely convolutional neural network, bidirectional long short-term memory network, and self-attention mechanism, were integrated to form our predictor model to characterize the extracted dsRNAs and their silencing efficiencies for T. castaneum. Our model dsRNAPredictor showed reliable performance in multiple independent tests based on different species, including both T. castaneum and Aedes aegypti. This indicates that dsRNAPredictor can facilitate prescreening when designing high-efficiency dsRNAs against insect target genes.
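
Word-embedding models for nucleotide sequences such as dna2vec operate on k-mer "words", so a sequence first has to be split into overlapping k-mers and mapped to integer ids. The sketch below shows that preprocessing step only. It is an illustration under stated assumptions, not the dsRNAPredictor code: the function names, `k=3`, `stride=1`, and the example sequence are all hypothetical.

```python
def kmer_tokens(sequence, k=3, stride=1):
    """Split a nucleotide sequence into overlapping k-mers (RNA 'U' is read as 'T')."""
    seq = sequence.upper().replace("U", "T")
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocabulary(token_lists):
    """Assign an integer id to each distinct k-mer, in order of first appearance."""
    vocabulary = {}
    for tokens in token_lists:
        for token in tokens:
            if token not in vocabulary:
                vocabulary[token] = len(vocabulary)
    return vocabulary


def encode(tokens, vocabulary):
    """Map k-mer tokens to the integer ids an embedding layer would look up."""
    return [vocabulary[token] for token in tokens]


if __name__ == "__main__":
    # Hypothetical short dsRNA fragment, used only to show the tokenisation.
    fragment = "ACGUACGU"
    tokens = kmer_tokens(fragment, k=3)
    vocab = build_vocabulary([tokens])
    print(tokens)
    print(encode(tokens, vocab))
```

The integer ids would then index rows of a trained embedding matrix, and the resulting vector sequence is what convolutional, recurrent, and attention layers consume.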

Pages: 858-865

Citations: 0

Sesame Genomic Web Resource (SesameGWR): a well-annotated data resource for transcriptomic signatures of abiotic and biotic stress responses in sesame (Sesamum indicum L.).

Pub Date : 2024-12-06 DOI: 10.1093/bfgp/elae022

Himanshu Avashthi, Ulavappa Basavanneppa Angadi, Divya Chauhan, Anuj Kumar, Dwijesh Chandra Mishra, Parimalan Rangan, Rashmi Yadav, Dinesh Kumar

Sesame (Sesamum indicum L.) is a globally cultivated oilseed crop renowned for its historical significance and widespread growth in tropical and subtropical regions. With notable nutritional and medicinal attributes, sesame has shown promising effects in combating malnutrition, cancer, diabetes, and other diseases like cardiovascular problems. However, sesame production faces significant challenges from environmental threats such as charcoal rot, drought, salinity, and waterlogging stress, resulting in economic losses for farmers. The scarcity of information on stress-resistance genes and pathways exacerbates these challenges. Despite its immense importance, there is currently no platform available to provide comprehensive information on sesame, which significantly hinders the mining of various stress-associated genes and the molecular breeding of sesame. To address this gap, a free, web-accessible, and user-friendly genomic web resource (SesameGWR, http://backlin.cabgrid.res.in/sesameGWR/) has been developed. This platform provides key insights into differentially expressed genes, transcription factors, miRNAs, and molecular markers like simple sequence repeats, single nucleotide polymorphisms, and insertions and deletions associated with both biotic and abiotic stresses. The functional genomics information and annotations embedded in this web resource were predicted through RNA-seq data analysis. Considering the impact of climate change and the nutritional and medicinal importance of sesame, this study is of utmost importance in understanding stress responses. SesameGWR will serve as a valuable tool for developing climate-resilient sesame varieties, thereby enhancing the productivity of this ancient oilseed crop.


Pages : 828-842

Citations: 0

The frontier of precision medicine: application of single-cell multi-omics in preimplantation genetic diagnosis.

IF 2.5 Zone 3 (Biology) Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY

Briefings in Functional Genomics

Pub Date : 2024-12-06 DOI: 10.1093/bfgp/elae041

Jinglei Zhang, Nan Zhang, Qingyun Mai, Canquan Zhou

The advent of single-cell multi-omics technologies has revolutionized the landscape of preimplantation genetic diagnosis (PGD), offering unprecedented insights into the genetic, transcriptomic, and proteomic profiles of individual cells in early-stage embryos. This breakthrough holds the promise of enhancing the accuracy, efficiency, and scope of PGD, thereby significantly improving outcomes in assisted reproductive technologies (ARTs) and genetic disease prevention. This review provides a comprehensive overview of the importance of PGD in the context of precision medicine and elucidates how single-cell multi-omics technologies have transformed this field. We begin with a brief history of PGD, highlighting its evolution and application in detecting genetic disorders and facilitating ART. Subsequently, we delve into the principles, methodologies, and applications of single-cell genomics, transcriptomics, and proteomics in PGD, emphasizing their role in improving diagnostic precision and efficiency. Furthermore, we review significant recent advances within this domain, including key experimental designs, findings, and their implications for PGD practices. The advantages and limitations of these studies are analyzed to assess their potential impact on the future development of PGD technologies. Looking forward, we discuss the emerging research directions and challenges, focusing on technological advancements, new application areas, and strategies to overcome existing limitations. In conclusion, this review underscores the pivotal role of single-cell multi-omics in PGD, highlighting its potential to drive the progress of precision medicine and personalized treatment strategies, thereby marking a new era in reproductive genetics and healthcare.


Pages : 726-732

Citations: 0