Briefings in Functional Genomics: Latest Articles

Omics-based deep learning approaches for lung cancer decision-making and therapeutics development.
IF 2.5 Zone 3 (Biology) Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-05-15 DOI: 10.1093/bfgp/elad031
Thi-Oanh Tran, Thanh Hoa Vo, Nguyen Quoc Khanh Le

Lung cancer has been the most common and the leading cause of cancer deaths globally. Besides clinicopathological observations and traditional molecular tests, the advent of robust and scalable techniques for nucleic acid analysis has revolutionized biological research and medicinal practice in lung cancer treatment. In response to the demands for minimally invasive procedures and technology development over the past decade, many types of multi-omics data at various genome levels have been generated. As omics data grow, artificial intelligence models, particularly deep learning, are prominent in developing more rapid and effective methods to potentially improve lung cancer patient diagnosis, prognosis and treatment strategy. This decade has seen genome-based deep learning models thriving in various lung cancer tasks, including cancer prediction, subtype classification, prognosis estimation, cancer molecular signatures identification, treatment response prediction and biomarker development. In this study, we summarized available data sources for deep-learning-based lung cancer mining and provided an update on recent deep learning models in lung cancer genomics. Subsequently, we reviewed the current issues and discussed future research directions of deep-learning-based lung cancer genomics research.

Pages: 181-192. Citations: 0
Widespread transcriptomic alterations of transient receptor potential channel genes in cancer.
IF 4 Zone 3 (Biology) Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-05-15 DOI: 10.1093/bfgp/elad023
Tao Pan, Yueying Gao, Gang Xu, Lei Yu, Qi Xu, Jinyang Yu, Meng Liu, Can Zhang, Yanlin Ma, Yongsheng Li

Genes encoding ion channels, in particular transient receptor potential (TRP) channels, are essential and play important roles in many physiological processes. Emerging evidence has demonstrated that TRP genes are involved in a number of diseases, including various cancer types. However, we still lack knowledge about the landscape of TRP gene expression alterations across cancer types. In this review, we comprehensively reviewed and summarised the transcriptomes from more than 10 000 samples in 33 cancer types. We found that TRP genes were widely dysregulated at the transcriptomic level in cancer, and that this dysregulation was associated with the clinical survival of cancer patients. Perturbations of TRP genes were associated with a number of cancer pathways across cancer types. Moreover, we reviewed the functions of TRP family gene alterations in a number of diseases reported in recent studies. Taken together, our study comprehensively reviewed TRP genes with extensive transcriptomic alterations, and knowledge of their functions will directly contribute to cancer therapy and precision medicine.

Pages: 214-227. Citations: 0
Mapping of long stretches of highly conserved sequences in over 6 million SARS-CoV-2 genomes.
IF 4 Zone 3 (Biology) Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-05-15 DOI: 10.1093/bfgp/elad027
Akhil Kumar, Rishika Kaushal, Himanshi Sharma, Khushboo Sharma, Manoj B Menon, Vivekanandan P

We identified 11 conserved stretches in over 6.3 million SARS-CoV-2 genomes, including all the major variants of concern. Each conserved stretch is ≥100 nucleotides in length with ≥99.9% conservation at each nucleotide position. Interestingly, six of the eight conserved stretches in ORF1ab overlapped significantly with well-folded, experimentally verified RNA secondary structures. Furthermore, two of the conserved stretches were mapped to regions within the S2-subunit that undergo dynamic structural rearrangements during viral fusion. In addition, the conserved stretches were significantly depleted for zinc-finger antiviral protein (ZAP) binding sites, which facilitate the recognition and degradation of viral RNA. These highly conserved stretches in the SARS-CoV-2 genome were poorly conserved at the nucleotide level among closely related β-coronaviruses, thus representing ideal targets for highly specific and discriminatory diagnostic assays. Our findings highlight the role of structural constraints at both RNA and protein levels that contribute to the sequence conservation of specific genomic regions in SARS-CoV-2.
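The two thresholds in the abstract (length ≥100 nucleotides, ≥99.9% conservation at every position) can be illustrated with a minimal sketch. It assumes the genomes are already aligned to equal length and measures conservation as the share of genomes carrying the most common base in each column; the function names and the toy data are hypothetical, and the authors' actual pipeline is not described in this listing.

```python
from collections import Counter

def position_conservation(aligned_seqs):
    """Fraction of sequences carrying the most common base in each alignment column."""
    n = len(aligned_seqs)
    return [Counter(col).most_common(1)[0][1] / n for col in zip(*aligned_seqs)]

def conserved_stretches(conservation, min_len=100, min_cons=0.999):
    """Return half-open (start, end) runs in which every position meets min_cons."""
    stretches, start = [], None
    for i, c in enumerate(conservation):
        if c >= min_cons:
            if start is None:
                start = i  # a new candidate run begins here
        else:
            if start is not None and i - start >= min_len:
                stretches.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(conservation) - start >= min_len:
        stretches.append((start, len(conservation)))
    return stretches

# Toy alignment with relaxed thresholds so the behaviour is visible.
toy = ["ACGTACGT", "ACGTACGA", "ACGTTCGT", "ACGTACGT"]
toy_cons = position_conservation(toy)
print(conserved_stretches(toy_cons, min_len=3, min_cons=1.0))
```

With the default arguments the function applies the thresholds quoted in the abstract; the toy call lowers them only because the example alignment is eight columns long.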

Pages: 256-264. Citations: 0
Integration of hybrid and self-correction method improves the quality of long-read sequencing data.
IF 4 Zone 3 (Biology) Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-05-15 DOI: 10.1093/bfgp/elad026
Tao Tang, Yiping Liu, Binshuang Zheng, Rong Li, Xiaocai Zhang, Yuansheng Liu

Third-generation sequencing (TGS) technologies have revolutionized genome science in the past decade. However, the long-read data produced by TGS platforms suffer from a much higher error rate than that of the previous technologies, thus complicating the downstream analysis. Several error correction tools for long-read data have been developed; these tools can be categorized into hybrid and self-correction tools. So far, these two types of tools are separately investigated, and their interplay remains understudied. Here, we integrate hybrid and self-correction methods for high-quality error correction. Our procedure leverages the inter-similarity between long-read data and high-accuracy information from short reads. We compare the performance of our method and state-of-the-art error correction tools on Escherichia coli and Arabidopsis thaliana datasets. The result shows that the integration approach outperformed the existing error correction methods and holds promise for improving the quality of downstream analyses in genomic research.

Pages: 249-255. Citations: 0
DeepCMI: a graph-based model for accurate prediction of circRNA-miRNA interactions with multiple information.
IF 4 Zone 3 (Biology) Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-05-15 DOI: 10.1093/bfgp/elad030
Yue-Chao Li, Zhu-Hong You, Chang-Qing Yu, Lei Wang, Lun Hu, Peng-Wei Hu, Yan Qiao, Xin-Fei Wang, Yu-An Huang

Recently, the role of competing endogenous RNAs in regulating gene expression through the interaction of microRNAs has been closely associated with the expression of circular RNAs (circRNAs) in various biological processes such as reproduction and apoptosis. While the number of confirmed circRNA-miRNA interactions (CMIs) continues to increase, the conventional in vitro approaches for discovery are expensive, labor intensive, and time consuming. Therefore, there is an urgent need for effective prediction of potential CMIs through appropriate data modeling and prediction based on known information. In this study, we proposed a novel model, called DeepCMI, that utilizes multi-source information on circRNA/miRNA to predict potential CMIs. Comprehensive evaluations on the CMI-9905 and CMI-9589 datasets demonstrated that DeepCMI successfully infers potential CMIs. Specifically, DeepCMI achieved AUC values of 90.54% and 94.8% on the CMI-9905 and CMI-9589 datasets, respectively. These results suggest that DeepCMI is an effective model for predicting potential CMIs and has the potential to significantly reduce the need for downstream in vitro studies. To facilitate the use of our trained model and data, we have constructed a computational platform, which is available at http://120.77.11.78/DeepCMI/. The source code and datasets used in this work are available at https://github.com/LiYuechao1998/DeepCMI.
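The AUC values quoted above summarize how well a model ranks true interactions above non-interactions. As a minimal sketch of the metric itself (not of DeepCMI or its evaluation protocol), AUC can be computed as the fraction of positive-negative pairs that are ordered correctly, with ties counted as half; the function name and the toy labels and scores are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count correctly ordered pairs; a tie contributes half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data: 1 = known interaction, 0 = non-interaction (made-up scores).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(roc_auc(toy_labels, toy_scores))  # 8 of the 9 positive-negative pairs are ordered correctly
```

This pairwise form is quadratic in the number of samples, which is acceptable for a sketch; rank-based implementations are preferable for datasets of the size mentioned in the abstract.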

Pages: 276-285. Citations: 0
Prediction of strand-specific and cell-type-specific G-quadruplexes based on high-resolution CUT&Tag data.
IF 4 Zone 3 (Biology) Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-05-15 DOI: 10.1093/bfgp/elad024
Yizhi Cui, Hongzhi Liu, Yutong Ming, Zheng Zhang, Li Liu, Ruijun Liu

G-quadruplex (G4), a non-classical deoxyribonucleic acid structure, is widely distributed in the genome and involved in various biological processes. In vivo high-throughput sequencing has indicated that G4s are significantly enriched at functional regions in a cell-type-specific manner. Computational prediction of G4s is therefore needed as an alternative to time-consuming and laborious experimental methods. Recently, G4 CUT&Tag has been developed to generate higher-resolution sequencing data than ChIP-seq, which provides more accurate training samples for model construction. In this paper, we present a new dataset construction method based on G4 CUT&Tag sequencing data and an XGBoost prediction model based on the machine learning boosting method. The results show that our model performs well within and across cell types. Furthermore, sequence analysis indicates that the formation of the G4 structure is greatly affected by the flanking sequences, and that the GC content of G4 flanking sequences is higher than that of non-G4 sequences. Moreover, we also identified G4 motifs in the high-resolution dataset, among which we found several motifs for known transcription factors (TFs), such as SP2 and BPC. These TFs may directly or indirectly affect the formation of the G4 structure.
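The flanking-sequence observation can be made concrete with a small sketch. It does not reproduce the paper's XGBoost model or its CUT&Tag-derived dataset; instead it assumes the widely used canonical G4 pattern (four runs of at least three guanines separated by loops of 1 to 7 bases) to locate candidate sites on the given strand, then reports the GC content of the bases on either side. The function names, the flank width and the toy sequence are all hypothetical.

```python
import re

# Canonical G4 pattern on the given strand: G{3,} N{1,7} G{3,} N{1,7} G{3,} N{1,7} G{3,}
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")

def gc_content(seq):
    """Fraction of G or C bases; 0.0 for an empty sequence."""
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0

def g4_with_flanks(seq, flank=50):
    """Return (start, end, flank_gc) for each non-overlapping candidate G4 match."""
    seq = seq.upper()
    hits = []
    for m in G4_PATTERN.finditer(seq):
        left = seq[max(0, m.start() - flank):m.start()]
        right = seq[m.end():m.end() + flank]
        hits.append((m.start(), m.end(), gc_content(left + right)))
    return hits

# Toy sequence: a telomere-like G4 core with six flanking bases on each side.
toy_seq = "ATCGCC" + "GGGTTAGGGTTAGGGTTAGGG" + "CCGCAT"
print(g4_with_flanks(toy_seq, flank=6))
```

To scan the opposite strand under the same assumption, the function could be applied to the reverse complement of the sequence.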

Pages: 265-275. Citations: 0
Prognostic signature analysis and survival prediction of esophageal cancer based on N6-methyladenosine associated lncRNAs.
IF 4 Zone 3 (Biology) Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-05-15 DOI: 10.1093/bfgp/elad028
Ting He, Zhipeng Gao, Ling Lin, Xu Zhang, Quan Zou

Esophageal cancer (ESCA) has a poor prognosis. Long non-coding RNAs (lncRNAs) affect cell proliferation. However, the prognostic function of N6-methyladenosine (m6A)-associated lncRNAs (m6A-lncRNAs) in ESCA remains unknown. Univariate Cox analysis was applied to identify prognosis-related m6A-lncRNAs, based on which the samples were clustered. Wilcoxon rank and Chi-square tests were adopted to compare the clinical traits, survival, pathway activity and immune infiltration across clusters; overall survival, clinical traits (N stage), tumor-invasive immune cells and pathway activity were found to differ significantly. Through a least absolute shrinkage and selection operator and proportional hazards (Lasso-Cox) model, five m6A-lncRNAs were selected to construct the prognostic signature (m6A-lncSig) and risk score. To investigate the link between risk score and clinical traits or immunological microenvironments, the Chi-square test and Spearman correlation analysis were used. Risk score was found to be associated with N stage, tumor stage, the different clusters, M2 macrophages, naive B cells and resting memory CD4 T cells. Risk score and tumor stage were found to be independent prognostic variables, and the constructed nomogram model had high accuracy in predicting prognosis. The obtained m6A-lncSig could be taken as a potential prognostic biomarker for ESCA patients. This study offers a theoretical foundation for clinical diagnosis and prognosis of ESCA.
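A signature-based risk score of this kind is commonly computed as a weighted sum of the selected genes' expression values, with the weights taken from the fitted Lasso-Cox model. The sketch below illustrates that arithmetic and a median split into high- and low-risk groups. The abstract gives neither the five lncRNA names nor their coefficients, so the gene names, coefficients and expression values here are hypothetical placeholders, and the median cutoff is an assumption rather than the authors' stated choice.

```python
from statistics import median

# Hypothetical coefficients for illustration only; not values from the paper.
COEFS = {"lncRNA_A": 0.5, "lncRNA_B": -0.25, "lncRNA_C": 1.0}

def risk_score(expression, coefs=COEFS):
    """Weighted sum of expression values over the signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)

def stratify(samples, coefs=COEFS):
    """Score every sample and label it 'high' if its score exceeds the median."""
    scores = {sid: risk_score(expr, coefs) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    groups = {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}
    return groups, scores

# Made-up expression profiles for three samples.
toy_samples = {
    "S1": {"lncRNA_A": 2.0, "lncRNA_B": 4.0, "lncRNA_C": 1.0},
    "S2": {"lncRNA_A": 4.0, "lncRNA_B": 0.0, "lncRNA_C": 2.0},
    "S3": {"lncRNA_A": 0.0, "lncRNA_B": 4.0, "lncRNA_C": 0.5},
}
toy_groups, toy_scores = stratify(toy_samples)
print(toy_scores)
print(toy_groups)
```

A positive coefficient raises the score as expression increases, while a negative coefficient lowers it, which is how a single number can combine risk-associated and protective markers.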

Pages: 239-248. Citations: 0
Prediction of drug-protein interaction based on dual channel neural networks with attention mechanism.
IF 4 Zone 3 (Biology) Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-05-15 DOI: 10.1093/bfgp/elad037
Dayu Tan, Haijun Jiang, Haitao Li, Ying Xie, Yansen Su

The precise identification of drug-protein interaction (DPI) can significantly speed up the drug discovery process. Bioassay methods are time-consuming and expensive when used to screen each drug-protein pair. Machine-learning-based methods cannot accurately predict a large number of DPIs. Compared with traditional computing methods, deep learning methods need less domain knowledge and have strong data learning ability. In this study, we construct a DPI prediction model based on dual channel neural networks with an efficient path attention mechanism, called DCA-DPI. The drug molecular graph and protein sequence are used as the data input of the model, and the residual graph neural network and the residual convolution network are used to learn the feature representation of the drug and protein, respectively, to obtain the feature vector of the drug and the hidden vectors of the protein. To get a more accurate protein feature vector, a weighted sum of the protein hidden vectors is computed using the neural attention mechanism. In the end, the drug and protein vectors are concatenated and input into the fully connected layer for classification. To evaluate the performance of DCA-DPI, three widely used public datasets, Human, C. elegans and DUD-E, are used in the experiment. The evaluation metric values in the experiment are superior to those of other relevant methods. Experiments show that our model is efficient for DPI prediction.
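The attention-weighted pooling and concatenation steps described above can be sketched in plain Python. This is an illustration under stated assumptions, not the DCA-DPI implementation: the abstract does not specify how attention scores are computed, so the sketch assumes a dot product between the drug vector and each protein hidden vector followed by a softmax. All function names and the toy vectors are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(drug_vec, protein_hidden):
    """Weight each protein hidden vector by its (assumed) dot-product score with the drug."""
    scores = [sum(d * h for d, h in zip(drug_vec, hvec)) for hvec in protein_hidden]
    weights = softmax(scores)
    dim = len(protein_hidden[0])
    pooled = [sum(w * hvec[j] for w, hvec in zip(weights, protein_hidden))
              for j in range(dim)]
    return pooled, weights

def joint_features(drug_vec, protein_hidden):
    """Concatenate the drug vector with the attention-pooled protein vector."""
    pooled, _ = attention_pool(drug_vec, protein_hidden)
    return list(drug_vec) + pooled

# Toy vectors: one drug embedding and two protein hidden states.
toy_drug = [1.0, 0.0]
toy_hidden = [[0.0, 1.0], [0.0, 3.0]]
print(joint_features(toy_drug, toy_hidden))
```

In a trained model the concatenated vector would be passed to the fully connected classification layer; that layer and the feature extractors are omitted here.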

精确识别药物-蛋白质相互作用(DPI)可大大加快药物发现过程。生物测定方法筛选每一对药物蛋白既耗时又昂贵。基于机器学习的方法无法准确预测大量的 DPI。与传统计算方法相比,深度学习方法需要的领域知识更少,数据学习能力更强。在本研究中,我们构建了一种基于双通道神经网络和高效路径注意机制的DPI预测模型,称为DCA-DPI。以药物分子图谱和蛋白质序列作为模型的数据输入,利用残差图神经网络和残差卷积网络分别学习药物和蛋白质的特征表示,得到药物的特征向量和蛋白质的隐向量。为了得到更精确的蛋白质特征向量,利用神经注意机制对蛋白质的隐藏向量进行加权求和。最后,将药物和蛋白质向量连接起来,输入全连接层进行分类。为了评估 DCA-DPI 的性能,实验中使用了三种广泛使用的公共数据,即人类、秀丽隐杆线虫和 DUD-E。实验中的评价指标值优于其他相关方法。实验表明,我们的模型在 DPI 预测方面是高效的。
Citations: 0
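The final stages described in the DCA-DPI abstract above (attention-weighted pooling of the protein hidden vectors, concatenation with the drug vector, and a fully connected classification layer) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the dot-product scoring against a `query` vector, the single sigmoid output unit, and all names (`attention_pool`, `predict_interaction`, `fc_weights`, `fc_bias`) are assumptions made for illustration, and the residual graph and convolutional encoders that would produce the inputs are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden, query):
    """Collapse per-position protein hidden vectors into one feature vector.

    hidden: list of L vectors, each of dimension d (encoder output).
    query:  vector of dimension d; a hypothetical learned parameter.
    Returns (pooled_vector, attention_weights).
    """
    # Assumed scoring rule: dot product between each hidden vector and the query.
    scores = [sum(h_j * q_j for h_j, q_j in zip(h, query)) for h in hidden]
    weights = softmax(scores)
    dim = len(hidden[0])
    pooled = [sum(w * h[j] for w, h in zip(weights, hidden)) for j in range(dim)]
    return pooled, weights


def predict_interaction(drug_vec, protein_hidden, query, fc_weights, fc_bias):
    """Concatenate drug and pooled protein vectors, then apply one dense unit."""
    protein_vec, _ = attention_pool(protein_hidden, query)
    joint = list(drug_vec) + protein_vec  # concatenation of the two channels
    logit = sum(w * x for w, x in zip(fc_weights, joint)) + fc_bias
    return 1.0 / (1.0 + math.exp(-logit))  # probability that the pair interacts
```

In a trained model the query vector and the dense-layer parameters would be learned jointly with the two encoders; here they are plain inputs so that the data flow is easy to follow.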
Breast cancer prognosis through the use of multi-modal classifiers: current state of the art and the way forward
IF 4 Zone 3 (Biology) Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date : 2024-05-01 DOI: 10.1093/bfgp/elae015
Archana Mathur, Nikhilanand Arya, Kitsuchart Pasupa, Sriparna Saha, Sudeepa Roy Dey, Snehanshu Saha
We present a survey of the current state of the art in breast cancer detection and prognosis. We analyze the evolution of Artificial Intelligence-based approaches from using only uni-modal information to using multi-modal information for detection, and how this paradigm shift improves the efficacy of detection, consistent with clinical observations. We conclude that interpretable AI-based predictions and the ability to handle class imbalance should be considered priorities.
Citations: 0
Short-homology-mediated PCR-based method for gene introduction in the fission yeast Schizosaccharomyces pombe
IF 4 Zone 3 (Biology) Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date : 2024-04-26 DOI: 10.1093/bfgp/elae016
Cai-Xia Zhang, Ying-Chun Hou
Schizosaccharomyces pombe is a commonly utilized model organism for studying various aspects of eukaryotic cell physiology. One reason for its widespread use as an experimental system is the ease of genetic manipulations, leveraging the natural homology-targeted repair mechanism to accurately modify the genome. We conducted a study to assess the feasibility and efficiency of directly introducing exogenous genes into the fission yeast S. pombe using Polymerase Chain Reaction (PCR) with short-homology flanking sequences. Specifically, we amplified the NatMX6 gene (which provides resistance to nourseothricin) using PCR with oligonucleotides that had short flanking regions of 20 bp, 40 bp, 60 bp and 80 bp to the target gene. By using this purified PCR product, we successfully introduced the NatMX6 gene at position 171 385 on chromosome III in S. pombe. We have made a simple modification to the transformation procedure, resulting in a significant increase in transformation efficiency by at least 5-fold. The success rate of gene integration at the target position varied between 20% and 50% depending on the length of the flanking regions. Additionally, we discovered that the addition of dimethyl sulfoxide and boiled carrier DNA increased the number of transformants by ~60- and 3-fold, respectively. Furthermore, we found that the removal of the pku70+ gene improved the transformation efficiency to ~5% and reduced the formation of small background colonies. Overall, our results demonstrate that with this modified method, even very short stretches of homologous regions (as short as 20 bp) can be used to effectively target genes at a high frequency in S. pombe. This finding greatly facilitates the introduction of exogenous genes in this organism.
Citations: 0
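The primer design implied by the S. pombe abstract above (oligonucleotides carrying 20 to 80 bp of homology to the sequence flanking the target site, followed by a region that anneals to the NatMX6 cassette) can be sketched as follows. This is an illustrative sketch only: the abstract does not give the actual primer or annealing sequences, so `cassette_fwd` and `cassette_rev` are hypothetical placeholders, and the 0-based `position` convention is an assumption of this example.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(genome, position, homology_len, cassette_fwd, cassette_rev):
    """Build a primer pair with short homology arms around an insertion site.

    genome:       top-strand chromosome sequence, written 5'->3'.
    position:     0-based index; the cassette is inserted just before it.
    homology_len: length of each flanking arm (e.g. 20, 40, 60 or 80).
    cassette_fwd: 5'->3' sequence annealing to the cassette start (placeholder).
    cassette_rev: 5'->3' sequence annealing to the cassette end (placeholder).
    """
    if position < homology_len or position + homology_len > len(genome):
        raise ValueError("not enough flanking sequence for the requested arm length")
    upstream = genome[position - homology_len:position]
    downstream = genome[position:position + homology_len]
    # The forward primer reads along the top strand into the cassette.
    forward = upstream + cassette_fwd
    # The reverse primer reads along the bottom strand, so the downstream
    # arm is reverse-complemented before the cassette-annealing part.
    reverse = reverse_complement(downstream) + cassette_rev
    return forward, reverse
```

Calling `design_primers` once per arm length (20, 40, 60 and 80) would produce the four primer pairs needed to compare integration success across flank lengths, as the study does.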