Briefings in Functional Genomics: Latest Articles

DockingGA: enhancing targeted molecule generation using transformer neural network and genetic algorithm with docking simulation
IF 4, Zone 3 (Biology), Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-04-07, DOI: 10.1093/bfgp/elae011
Changnan Gao, Wenjie Bao, Shuang Wang, Jianyang Zheng, Lulu Wang, Yongqi Ren, Linfang Jiao, Jianmin Wang, Xun Wang
Generative molecular models generate novel molecules with desired properties by searching chemical space. Traditional combinatorial optimization methods, such as genetic algorithms, have demonstrated superior performance in various molecular optimization tasks. However, these methods do not use docking simulation to inform the design process, depend heavily on the quality and quantity of available data, and produce molecules that require additional structural optimization to become candidate drugs. To address these limitations, we propose a novel model named DockingGA that combines Transformer neural networks and genetic algorithms to generate molecules with better binding affinity for specific targets. To generate high-quality molecules, we chose Self-referencing Chemical Structure Strings to represent the molecules and optimized their binding affinity to different targets. Compared with the baseline models, DockingGA achieves the best docking results for the top 1, 10 and 100 molecules while maintaining 100% novelty. Furthermore, the distribution of physicochemical properties demonstrates the ability of DockingGA to generate molecules with favorable and appropriate properties. This innovation creates new opportunities for the application of generative models in practical drug discovery.
Citations: 0
A comprehensive survey on deep learning-based identification and predicting the interaction mechanism of long non-coding RNAs
IF 4, Zone 3 (Biology), Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-04-05, DOI: 10.1093/bfgp/elae010
Biyu Diao, Jin Luo, Yu Guo
With advances in sequencing technology and genomics research, long noncoding RNAs (lncRNAs) have been found to be extensively involved in eukaryotic epigenetic, transcriptional, and post-transcriptional regulatory processes. They therefore play crucial roles in the body’s normal physiology and in various disease outcomes. At present, large amounts of uncharacterized lncRNA sequencing data remain to be explored. As the era of artificial intelligence progresses, deep learning-based prediction models for lncRNAs provide valuable insights for researchers, substantially reducing the time and cost of trial and error and facilitating the identification of disease-relevant lncRNAs for prognosis analysis and targeted drug development. However, most lncRNA researchers are unaware of the latest advances in deep learning models, or of how to select and apply such models in functional research on lncRNAs. Thus, we elucidate the concept of deep learning models, explore several prevalent deep learning algorithms and their data preferences, comprehensively review studies from the past 5 years with exemplary predictive performance across diverse prediction functions, critically analyze and discuss the merits and limitations of current deep learning models and solutions, and propose prospects based on cutting-edge advancements in lncRNA research.
Citations: 0
An improved hierarchical variational autoencoder for cell-cell communication estimation using single-cell RNA-seq data.
IF 4, Zone 3 (Biology), Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-03-20, DOI: 10.1093/bfgp/elac056
Shuhui Liu, Yupei Zhang, Jiajie Peng, Xuequn Shang

Analysis of cell-cell communication (CCC) in the tumor micro-environment helps decipher the underlying mechanism of cancer progression and drug tolerance. Currently, single-cell RNA-Seq data are available on a large scale, providing an unprecedented opportunity to predict cellular communications. There have been many achievements and applications in inferring cell-cell communication based on the known interactions between molecules, such as ligands, receptors and extracellular matrix. However, the prior information is not quite adequate and only involves a fraction of cellular communications, producing many false-positive or false-negative results. To this end, we propose an improved hierarchical variational autoencoder (HiVAE) based model to fully use single-cell RNA-seq data for automatically estimating CCC. Specifically, the HiVAE model is used to learn the potential representation of cells on known ligand-receptor genes and all genes in single-cell RNA-seq data, respectively, which are then utilized for cascade integration. Subsequently, transfer entropy is employed to measure the transmission of information flow between two cells based on the learned representations, which are regarded as directed communication relationships. Experiments are conducted on single-cell RNA-seq data of the human skin disease dataset and the melanoma dataset, respectively. Results show that the HiVAE model is effective in learning cell representations, and transfer entropy could be used to estimate the communication scores between cell types.
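
The transfer-entropy step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the two signals have already been discretized into symbol sequences, uses a history length of 1, and relies on a simple plug-in (count-based) estimator. The function name `transfer_entropy` and the toy sequences are hypothetical.

```python
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    Assumption: both inputs are equal-length sequences of discrete symbols.
    TE = sum p(y_next, y_prev, x_prev)
             * log2( p(y_next | y_prev, x_prev) / p(y_next | y_prev) )
    """
    if len(source) != len(target) or len(source) < 2:
        raise ValueError("sequences must have equal length >= 2")

    # Each triple is (next target symbol, previous target symbol, previous source symbol).
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)

    c_full = Counter(triples)
    c_prev_pair = Counter((yp, xp) for _, yp, xp in triples)
    c_target_pair = Counter((yn, yp) for yn, yp, _ in triples)
    c_prev_target = Counter(yp for _, yp, _ in triples)

    te = 0.0
    for (yn, yp, xp), count in c_full.items():
        p_joint = count / n
        p_given_both = count / c_prev_pair[(yp, xp)]
        p_given_target = c_target_pair[(yn, yp)] / c_prev_target[yp]
        te += p_joint * log2(p_given_both / p_given_target)
    return te


# Toy usage: y copies x with a one-step delay, so x carries information about y's future.
x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1]
y = [0] + x[:-1]
print(transfer_entropy(x, y))
```

Because transfer entropy is asymmetric, swapping the arguments generally gives a different value, which is what makes it usable as a directed score. Applying it to continuous learned representations, as the abstract describes, would need a binning step or a different estimator.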

Pages: 118-127
Citations: 0
Single-cell RNA-seq data clustering by deep information fusion.
IF 4, Zone 3 (Biology), Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-03-20, DOI: 10.1093/bfgp/elad017
Liangrui Ren, Jun Wang, Wei Li, Maozu Guo, Guoxian Yu

Determining cell types from single-cell transcriptomics data is fundamental for downstream analysis. However, cell clustering and data imputation still face computational challenges due to the high dropout rate, sparsity and dimensionality of single-cell data. Although some deep learning-based solutions have been proposed to handle these challenges, they still cannot leverage gene attribute information and cell topology in a sensible way to obtain a consistent clustering. In this paper, we present scDeepFC, a deep information fusion-based single-cell data clustering method for cell clustering and data imputation. Specifically, scDeepFC uses a deep auto-encoder (DAE) network and a deep graph convolution network to embed high-dimensional gene attribute information and high-order cell-cell topological information into different low-dimensional representations, and then fuses them to generate a more comprehensive and accurate consensus representation via a deep information fusion network. In addition, scDeepFC integrates the zero-inflated negative binomial (ZINB) model into the DAE to model dropout events. By jointly optimizing the ZINB loss and the cell graph reconstruction loss, scDeepFC generates a salient embedding representation for clustering cells and imputing missing data. Extensive experiments on real single-cell datasets show that scDeepFC outperforms other popular single-cell analysis methods. Both the gene attribute and cell topology information can improve cell clustering.
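
To make the ZINB loss mentioned in the abstract above concrete, the following is a minimal sketch of the zero-inflated negative binomial negative log-likelihood for a single count. It uses the standard mean/dispersion parameterization and is an assumption about the general form of such a loss, not scDeepFC's code; the names `nb_log_pmf`, `zinb_nll`, `mu`, `theta` and `pi` are illustrative.

```python
from math import exp, lgamma, log


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial with mean mu > 0
    and dispersion theta > 0 (variance = mu + mu**2 / theta)."""
    return (
        lgamma(x + theta) - lgamma(theta) - lgamma(x + 1)
        + theta * log(theta / (theta + mu))
        + x * log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of count x under a ZINB model.

    pi (0 <= pi < 1) is the probability of a structural zero (dropout);
    with probability 1 - pi the count is drawn from the negative binomial.
    """
    if x == 0:
        # A zero can come from dropout or from the NB component itself.
        return -log(pi + (1.0 - pi) * exp(nb_log_pmf(0, mu, theta)))
    return -(log(1.0 - pi) + nb_log_pmf(x, mu, theta))


# Toy usage: loss for an observed zero and an observed count of 5.
print(zinb_nll(0, mu=3.0, theta=2.0, pi=0.3))
print(zinb_nll(5, mu=3.0, theta=2.0, pi=0.3))
```

In an autoencoder setting, `mu`, `theta` and `pi` would typically be per-gene outputs of the decoder, and the loss would be summed or averaged over all cells and genes.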

Pages: 128-137
Citations: 0
Recent advances in differential expression analysis for single-cell RNA-seq and spatially resolved transcriptomic studies.
IF 4, Zone 3 (Biology), Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-03-20, DOI: 10.1093/bfgp/elad011
Xiya Guo, Jin Ning, Yuanze Chen, Guoliang Liu, Liyan Zhao, Yue Fan, Shiquan Sun

Differential expression (DE) analysis is a necessary step in the analysis of single-cell RNA sequencing (scRNA-seq) and spatially resolved transcriptomics (SRT) data. Unlike traditional bulk RNA-seq, DE analysis for scRNA-seq or SRT data has unique characteristics that may contribute to the difficulty of detecting DE genes. However, the plethora of DE tools that work with various assumptions makes it difficult to choose an appropriate one. Furthermore, a comprehensive review on detecting DE genes for scRNA-seq data or SRT data from multi-condition, multi-sample experimental designs is lacking. To bridge such a gap, here, we first focus on the challenges of DE detection, then highlight potential opportunities that facilitate further progress in scRNA-seq or SRT analysis, and finally provide insights and guidance in selecting appropriate DE tools or developing new computational DE methods.

Pages: 95-109
Citations: 0
Integrating functional scoring and regulatory data to predict the effect of non-coding SNPs in a complex neurological disease.
IF 4, Zone 3 (Biology), Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-03-20, DOI: 10.1093/bfgp/elad020
Daniela Felício, Miguel Alves-Ferreira, Mariana Santos, Marlene Quintas, Alexandra M Lopes, Carolina Lemos, Nádia Pinto, Sandra Martins

Most SNPs associated with complex diseases seem to lie in non-coding regions of the genome; however, their contribution to gene expression and disease phenotype remains poorly understood. Here, we established a workflow to assist in prioritising the functional relevance of non-coding SNPs of candidate genes as susceptibility loci in polygenic neurological disorders. To illustrate the applicability of our workflow, we considered the multifactorial disorder migraine as a model to follow our step-by-step approach. We annotated the overlap of selected SNPs with regulatory elements and assessed their potential impact on gene expression based on publicly available prediction algorithms and functional genomics information. Some migraine risk loci have been hypothesised to reside in non-coding regions and to be implicated in the neurotransmission pathway. In this study, we used a set of 22 non-coding SNPs from neurotransmission and synaptic machinery-related genes previously suggested to be involved in migraine susceptibility based on our candidate gene association studies. After prioritising these SNPs, we focused on non-reported ones that demonstrated high regulatory potential: (1) VAMP2_rs1150 (3' UTR) was predicted as a target of hsa-mir-5010-3p miRNA, possibly disrupting its own gene expression; (2) STX1A_rs6951030 (proximal enhancer) may affect the binding affinity of zinc-finger transcription factors (namely ZNF423) and disturb TBL2 gene expression; and (3) SNAP25_rs2327264 (distal enhancer) is expected to lie in a binding site of the ONECUT2 transcription factor. This study demonstrated the applicability of our practical workflow to facilitate the prioritisation of potentially relevant non-coding SNPs and predict their functional impact in multifactorial neurological diseases.

Pages: 138-149
Citations: 0
Functional characteristics of DNA N6-methyladenine modification based on long-read sequencing in pancreatic cancer.
IF 4, Zone 3 (Biology), Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-03-20, DOI: 10.1093/bfgp/elad021
Dianshuang Zhou, Shiwei Guo, Yangyang Wang, Jiyun Zhao, Honghao Liu, Feiyang Zhou, Yan Huang, Yue Gu, Gang Jin, Yan Zhang

Abnormalities of DNA modifications are closely related to the pathogenesis and prognosis of pancreatic cancer. The development of third-generation sequencing technology has brought opportunities for the study of new epigenetic modifications in cancer. Here, we screened the N6-methyladenine (6mA) and 5-methylcytosine (5mC) modifications in pancreatic cancer based on Oxford Nanopore Technologies sequencing. 6mA levels were lower than 5mC levels and were upregulated in pancreatic cancer. We developed a novel method to define differentially methylated deficient regions (DMDRs), which overlapped 1319 protein-coding genes in pancreatic cancer. Genes screened by DMDRs were more significantly enriched in cancer genes than those screened by the traditional differential methylation method (P < 0.001 versus P = 0.21, hypergeometric test). We then identified a survival-related signature based on DMDRs (DMDRSig) that stratified patients into high- and low-risk groups. Functional enrichment analysis indicated that 891 genes were closely related to alternative splicing. Multi-omics data from The Cancer Genome Atlas showed that these genes were frequently altered in cancer samples. Survival analysis indicated that high expression of seven genes (ADAM9, ADAM10, EPS8, FAM83A, FAM111B, LAMA3 and TES) was significantly associated with poor prognosis. In addition, pancreatic cancer subtypes were distinguished using 46 subtype-specific genes and unsupervised clustering. Overall, our study is the first to explore the molecular characteristics of 6mA modifications in pancreatic cancer, indicating that 6mA has the potential to be a target for future clinical treatment.
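
The hypergeometric enrichment test cited in the abstract above can be sketched as follows. This is the textbook one-sided test, not the authors' pipeline, and the gene counts in the usage lines are hypothetical placeholders because the abstract does not report the underlying counts; the function name `hypergeom_enrichment_p` is also illustrative.

```python
from math import comb


def hypergeom_enrichment_p(k, population, successes, draws):
    """One-sided enrichment p-value P(X >= k) for X ~ Hypergeometric.

    population: total number of background genes
    successes:  number of background genes in the reference set (e.g. cancer genes)
    draws:      number of genes selected by the method
    k:          number of selected genes that fall in the reference set
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible overlaps contribute nothing to the tail sum.
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only.
p_value = hypergeom_enrichment_p(k=12, population=1000, successes=50, draws=100)
print(p_value)
```

Exact integer arithmetic is used here for clarity; for genome-scale counts a library routine working in log space would usually be preferable.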

Pages: 150-162
引用次数: 0
Quantifying transcriptome diversity: a review.
IF 2.5 Zone 3 Biology Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-03-20 DOI: 10.1093/bfgp/elad019
Emma F Jones, Anisha Haldar, Vishal H Oza, Brittany N Lasseigne

Following the central dogma of molecular biology, gene expression heterogeneity can aid in predicting and explaining the wide variety of protein products, functions and, ultimately, heterogeneity in phenotypes. The terminology currently used to describe the types of diversity in gene expression profiles overlaps, and overlooking these nuances can misrepresent important biological information. Here, we describe transcriptome diversity as a measure of the heterogeneity in (1) the expression of all genes within a sample or a single gene across samples in a population (gene-level diversity) or (2) the isoform-specific expression of a given gene (isoform-level diversity). We first give an overview of the modulators and quantification of transcriptome diversity at the gene level. Then, we discuss the role alternative splicing plays in driving transcript isoform-level diversity and how it can be quantified. Additionally, we survey computational resources for calculating gene-level and isoform-level diversity from high-throughput sequencing data. Finally, we discuss future applications of transcriptome diversity. This review provides a comprehensive overview of how gene expression diversity arises, and how measuring it yields a more complete picture of heterogeneity across proteins, cells, tissues, organisms and species.
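
The abstract does not name a specific diversity metric. As an illustration only, the sketch below assumes Shannon entropy, a measure commonly used for this purpose, and applies it to both cases the abstract distinguishes: all genes within one sample (gene-level) and the isoforms of one gene (isoform-level). The function name, gene names and abundance values are hypothetical.

```python
import math


def shannon_diversity(expression):
    """Shannon entropy (in bits) of an expression profile.

    `expression` maps a gene or isoform name to a non-negative abundance
    (for example TPM). Entries with zero abundance contribute nothing.
    """
    total = sum(expression.values())
    if total <= 0:
        raise ValueError("expression profile has no positive abundance")
    entropy = 0.0
    for value in expression.values():
        if value > 0:
            p = value / total  # relative abundance of this feature
            entropy -= p * math.log2(p)
    return entropy


# Gene-level diversity: all genes within one sample (hypothetical values).
sample = {"GENE_A": 50.0, "GENE_B": 25.0, "GENE_C": 25.0}
gene_level = shannon_diversity(sample)

# Isoform-level diversity: isoforms of a single gene (hypothetical values).
isoforms = {"GENE_A-201": 10.0, "GENE_A-202": 10.0}
isoform_level = shannon_diversity(isoforms)

print(f"gene-level: {gene_level:.3f} bits, isoform-level: {isoform_level:.3f} bits")
```

Entropy is zero when a single feature carries all the expression and is maximal when abundance is spread evenly, which is why it is a convenient first summary; the review itself may discuss other measures.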

Journal: Briefings in Functional Genomics. Pages: 83-94. Publication date: 2024-03-20. Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11484519/pdf/
Citations: 0
NTpred: a robust and precise machine learning framework for in silico identification of Tyrosine nitration sites in protein sequences.
IF 4 Zone 3 Biology Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-03-20 DOI: 10.1093/bfgp/elad018
Sourajyoti Datta, Muhammad Nabeel Asim, Andreas Dengel, Sheraz Ahmed

Post-translational modifications (PTMs) either enhance a protein's activity in various sub-cellular processes or degrade its activity, which leads to the failure of intracellular processes. Tyrosine nitration (NT) modification degrades protein activity, which initiates and propagates various diseases, including neurodegenerative, cardiovascular and autoimmune diseases and carcinogenesis. Identifying NT modification supports the development of novel therapies and drug discovery for the associated diseases. Identifying NT modification in biochemical labs is expensive, time-consuming and error-prone. To supplement this process, several computational approaches have been proposed. However, these approaches fail to identify NT modification precisely because they extract irrelevant, redundant and weakly discriminative features from protein sequences. This paper presents the NTpred framework, which extracts comprehensive features from raw protein sequences using four different sequence encoders. To reap the benefits of the different encoders, it generates four additional feature spaces by fusing different combinations of the individual encodings. Furthermore, it removes irrelevant and redundant features from the eight feature spaces through a Recursive Feature Elimination process. The selected features of the four individual encodings and four feature fusion vectors are used to train eight different Gradient Boosted Tree classifiers. The probability scores from the trained classifiers are used to generate a new probabilistic feature space, which is used to train a Logistic Regression classifier. On the BD1 benchmark dataset, the proposed framework outperforms the existing best-performing predictor in 5-fold cross validation and independent test evaluation, with a combined improvement of 13.7% in MCC and 20.1% in AUC. Similarly, on the BD2 benchmark dataset, the proposed framework outperforms the existing best-performing predictor with a combined improvement of 5.3% in MCC and 1.0% in AUC. NTpred is publicly available for further experimentation and predictive use at: https://sds_genetic_analysis.opendfki.de/PredNTS/.
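
The results above are reported as MCC (Matthews correlation coefficient) and AUC (area under the ROC curve). For readers unfamiliar with these metrics, the following minimal sketch computes both from their standard definitions. It is not the NTpred evaluation code; the labels, scores and the 0.5 decision threshold are hypothetical.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """ROC AUC as the probability that a positive outranks a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = nitrated site) and classifier probabilities.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

mcc_value = mcc(y_true, y_pred)
auc_value = auc(y_true, scores)
print(f"MCC = {mcc_value:.3f}, AUC = {auc_value:.3f}")
```

MCC depends on the chosen threshold because it is computed from hard predictions, whereas AUC is computed from the ranking of scores and is threshold-free, so reporting both gives complementary views of a predictor.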

Journal: Briefings in Functional Genomics. Pages: 163-179. Publication date: 2024-03-20. Publication type: Journal Article.
Citations: 0
Integrating single-cell RNA sequencing data to genome-wide association analysis data identifies significant cell types in influenza A virus infection and COVID-19.
IF 4 Zone 3 Biology Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-03-20 DOI: 10.1093/bfgp/elad025
Yixin Zou, Xifang Sun, Yifan Wang, Yidi Wang, Xiangyu Ye, Junlan Tu, Rongbin Yu, Peng Huang

With the global pandemic of COVID-19, research on influenza virus has entered a new stage, but the pathogenesis of influenza disease remains difficult to elucidate. Genome-wide association studies (GWASs) have shed considerable light on the role of host genetic background in influenza pathogenesis and prognosis, whereas single-cell RNA sequencing (scRNA-seq) has enabled unprecedented resolution of cellular diversity in vivo following influenza disease. Here, we performed a comprehensive analysis of influenza GWAS and scRNA-seq data to reveal cell types associated with influenza disease and provide clues to understanding its pathogenesis. We downloaded two GWAS summary datasets and two scRNA-seq datasets on influenza disease. After defining cell types for each scRNA-seq dataset, we used RolyPoly and LDSC-cts to integrate the GWAS and scRNA-seq data. Furthermore, we analyzed scRNA-seq data from the peripheral blood mononuclear cells (PBMCs) of a healthy population to validate and compare our results. After processing the scRNA-seq data, we obtained approximately 70 000 cells and identified up to 13 cell types. In the European population analysis, we found an association between neutrophils and influenza disease. In the East Asian population analysis, we identified an association between monocytes and influenza disease. In addition, we identified monocytes as a significantly related cell type in a dataset of healthy human PBMCs. In this comprehensive analysis, we identified neutrophils and monocytes as influenza disease-associated cell types. These associations warrant further attention and validation in future studies.
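
Approaches that link GWAS signals to cell types typically start from a per-cell-type summary of gene expression. The sketch below is not RolyPoly or LDSC-cts and does not reproduce their statistics. It only illustrates, under simplifying assumptions, how annotated single cells might be reduced to a cell-type specificity table of the kind such tools consume. The function names, gene names and expression values are hypothetical.

```python
def mean_expression_by_cell_type(cells):
    """Average expression per gene within each annotated cell type.

    `cells` is a list of (cell_type, {gene: expression}) pairs. A gene
    missing from a cell's profile is treated as zero expression.
    """
    sums, counts = {}, {}
    for cell_type, profile in cells:
        counts[cell_type] = counts.get(cell_type, 0) + 1
        bucket = sums.setdefault(cell_type, {})
        for gene, value in profile.items():
            bucket[gene] = bucket.get(gene, 0.0) + value
    return {
        ct: {g: total / counts[ct] for g, total in genes.items()}
        for ct, genes in sums.items()
    }


def specificity(means):
    """Share of each gene's summed mean expression found in each cell type."""
    gene_totals = {}
    for genes in means.values():
        for gene, value in genes.items():
            gene_totals[gene] = gene_totals.get(gene, 0.0) + value
    return {
        ct: {g: (v / gene_totals[g] if gene_totals[g] > 0 else 0.0)
             for g, v in genes.items()}
        for ct, genes in means.items()
    }


# Hypothetical annotated cells: (cell type, expression profile).
cells = [
    ("monocyte", {"G1": 8.0, "G2": 1.0}),
    ("monocyte", {"G1": 4.0, "G2": 1.0}),
    ("neutrophil", {"G1": 2.0, "G2": 3.0}),
]
means = mean_expression_by_cell_type(cells)
spec = specificity(means)
print(spec)
```

In this toy input, the hypothetical gene G1 is mostly expressed in monocytes and G2 mostly in neutrophils. A real analysis would pass such cell-type expression summaries, together with GWAS summary statistics, to the dedicated tools named in the abstract.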

Journal: Briefings in Functional Genomics. Pages: 110-117. Publication date: 2024-03-20. Publication type: Journal Article.
Citations: 0