
Frontiers in bioinformatics: latest articles

Molecular timetrees using relaxed clocks and uncertain phylogenies.
Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-08-03 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1225807
Jose Barba-Montoya, Sudip Sharma, Sudhir Kumar

A common practice in molecular systematics is to infer a phylogeny and then scale it to time by using a relaxed clock method and calibrations. This sequential analysis ignores the effect of phylogenetic uncertainty on divergence time estimates and their confidence/credibility intervals. An alternative is to infer phylogeny and times jointly to incorporate phylogenetic errors into molecular dating. We compared the performance of these two alternatives in reconstructing evolutionary timetrees using computer-simulated and empirical datasets. We found sequential and joint analyses to produce similar divergence times and phylogenetic relationships, except for some nodes in particular cases. Joint inference performed better when the phylogeny was not well resolved, and it should be preferred in such situations. However, joint inference can be infeasible for large datasets because available Bayesian methods are computationally burdensome. We present an alternative approach for joint inference that combines the bag of little bootstraps, maximum likelihood, and RelTime approaches for simultaneously inferring evolutionary relationships, divergence times, and confidence intervals, incorporating phylogeny uncertainty. The new method alleviates the high computational burden imposed by Bayesian methods while achieving a similar result.
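
The bag of little bootstraps named in the abstract can be illustrated on a much simpler problem than timetree inference. The sketch below is only a generic illustration, not the authors' pipeline: it estimates the standard error of an arbitrary estimator (here the mean) and does not involve maximum likelihood trees or RelTime. The function name, its parameters, and the toy data are assumptions made for this example.

```python
import random
import statistics


def bag_of_little_bootstraps(data, estimator, n_subsets=5, subset_exp=0.7,
                             n_resamples=50, seed=0):
    """Estimate the standard error of `estimator` with the bag of little bootstraps.

    Each small subset of size b = n**subset_exp is resampled up to the full
    size n, so every replicate mimics a full-size bootstrap sample while
    touching only b distinct data points. (Hypothetical helper for illustration.)
    """
    rng = random.Random(seed)
    n = len(data)
    b = max(2, int(n ** subset_exp))
    per_subset_se = []
    for _ in range(n_subsets):
        # Little sample: b points drawn without replacement.
        subset = rng.sample(data, b)
        estimates = []
        for _ in range(n_resamples):
            # Draw n points with replacement from the b-point subset.
            resample = rng.choices(subset, k=n)
            estimates.append(estimator(resample))
        per_subset_se.append(statistics.stdev(estimates))
    # Average the per-subset uncertainty estimates.
    return statistics.fmean(per_subset_se)


if __name__ == "__main__":
    toy_rng = random.Random(1)
    toy = [toy_rng.gauss(10.0, 2.0) for _ in range(1000)]
    se = bag_of_little_bootstraps(toy, statistics.fmean)
    print(f"BLB standard error of the mean: {se:.3f}")
```

In a dating context the estimator would presumably be a node age computed from a resampled alignment rather than a mean, which is where the computational savings over a full bootstrap would matter.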

Frontiers in bioinformatics, vol. 3, article 1225807 (2023-08-03). Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10435864/pdf/
Citations: 0
XMR: an explainable multimodal neural network for drug response prediction.
Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-08-02 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1164482
Zihao Wang, Yun Zhou, Yu Zhang, Yu K Mo, Yijie Wang

Introduction: Existing large-scale preclinical cancer drug response databases provide a great opportunity to identify and predict potentially effective drugs to combat cancers. Deep learning models built on these databases have been developed and applied to the cancer drug-response prediction task. Their predictions have been demonstrated to significantly outperform traditional machine learning methods. However, because of their "black box" nature, biologically faithful explanations are hard to derive from these deep learning models. Interpretable deep learning models that rely on visible neural networks (VNNs) have been proposed to provide biological justification for the predicted outcomes. However, their performance falls short of what application in clinical practice requires. Methods: In this paper, we develop XMR, an eXplainable Multimodal neural network for drug Response prediction. XMR is a new compact multimodal neural network consisting of two sub-networks: a visible neural network for learning genomic features and a graph neural network (GNN) for learning drugs' structural features. Both sub-networks are integrated into a multimodal fusion layer to model the drug response for the given gene mutations and the drug's molecular structure. Furthermore, a pruning approach is applied to provide better interpretations of the XMR model. We use five pathway hierarchies (cell cycle, DNA repair, diseases, signal transduction, and metabolism), obtained from the Reactome Pathway Database, as the architecture of the VNN in our XMR model to predict drug responses of triple-negative breast cancer. Results: We find that our model outperforms other state-of-the-art interpretable deep learning models in terms of predictive performance. In addition, our model can provide biological insights into explaining drug responses for triple-negative breast cancer.
Discussion: Overall, by combining a VNN and a GNN in a multimodal fusion layer, XMR captures key genomic and molecular features and offers reasonable biological interpretability, thereby better predicting drug responses in cancer patients. Our model would also benefit personalized cancer therapy in the future.
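
The idea of a multimodal fusion layer can be sketched in a few lines. The abstract does not say which fusion operator XMR uses, so the example below assumes simple concatenation followed by one sigmoid unit, and it stands in for the GNN readout with mean pooling over per-atom features. All names, shapes, and weights here are hypothetical and are not taken from the XMR code.

```python
import math


def mean_pool(atom_features):
    """Collapse per-atom feature vectors into one drug embedding (toy GNN readout)."""
    n_atoms = len(atom_features)
    dim = len(atom_features[0])
    return [sum(atom[i] for atom in atom_features) / n_atoms for i in range(dim)]


def fuse_and_predict(genomic_embedding, drug_embedding, weights, bias):
    """Concatenate both modality embeddings and apply one sigmoid-activated linear unit."""
    fused = list(genomic_embedding) + list(drug_embedding)
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused embedding length")
    score = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    genomic = [0.2, -0.1]               # assumed output of the genomic sub-network
    atoms = [[1.0, 0.0], [0.0, 1.0]]    # assumed per-atom features of one drug
    drug = mean_pool(atoms)
    print(fuse_and_predict(genomic, drug, weights=[1.0, 1.0, 1.0, 1.0], bias=0.0))
```

Concatenation keeps the two modalities separable in the first fused layer, which is one reason it is a common default; other fusion operators (for example gated or attention-based ones) would fit the same interface.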

Frontiers in bioinformatics, vol. 3, article 1164482 (2023-08-02). Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10433751/pdf/
Citations: 0
3' RNA-seq is superior to standard RNA-seq in cases of sparse data but inferior at identifying toxicity pathways in a model organism.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-07-27 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1234218
Ryan S McClure, Yvonne Rericha, Katrina M Waters, Robyn L Tanguay

Introduction: The application of RNA-sequencing has led to numerous breakthroughs in investigating gene expression levels in complex biological systems. Among these is knowledge of how organisms, such as the vertebrate model organism zebrafish (Danio rerio), respond to toxicant exposure. Recently, the development of 3' RNA-seq has allowed gene expression levels to be determined with a fraction of the reads required by standard RNA-seq. While 3' RNA-seq has many advantages, a comparison to standard RNA-seq has not been performed in the context of whole organism toxicity and sparse data. Methods and results: Here, we examined samples from zebrafish exposed to perfluorobutane sulfonamide (FBSA) with either 3' or standard RNA-seq to determine the advantages of each with regard to the identification of functionally enriched pathways. We found that 3' and standard RNA-seq showed specific advantages when focusing on annotated or unannotated regions of the genome. We also found that standard RNA-seq identified more differentially expressed genes (DEGs), but that this advantage disappeared under conditions of sparse data. We also found that standard RNA-seq had a significant advantage in identifying functionally enriched pathways via analysis of DEG lists but that this advantage was minimal when identifying pathways via gene set enrichment analysis of all genes. Conclusions: These results show that each approach has experimental conditions under which it may be advantageous. Our observations can help guide others in the choice of 3' RNA-seq vs standard RNA sequencing to query gene expression levels in a range of biological systems.
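
Identifying enriched pathways "via analysis of DEG lists" is commonly done with an over-representation test. The abstract does not state which test the authors used, so the sketch below assumes the standard upper-tail hypergeometric test; the function name and the toy counts are illustrative only.

```python
from math import comb


def enrichment_p_value(n_genes, n_pathway, n_degs, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_genes   : genes in the background (e.g. all genes measured)
    n_pathway : background genes annotated to the pathway
    n_degs    : differentially expressed genes in the list
    n_overlap : DEGs that are also in the pathway
    """
    max_overlap = min(n_pathway, n_degs)
    total = comb(n_genes, n_degs)
    tail = sum(
        comb(n_pathway, k) * comb(n_genes - n_pathway, n_degs - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 10 background genes, 4 in the pathway, 3 DEGs, 2 of them in the pathway.
    print(enrichment_p_value(10, 4, 3, 2))
```

Because this test only sees the genes that pass a DEG threshold, it is sensitive to how many DEGs a sequencing method detects, whereas rank-based gene set enrichment analysis uses all genes. That difference is consistent with the abstract's observation that the gap between methods shrinks under the latter.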

Frontiers in bioinformatics, vol. 3, article 1234218 (2023-07-27). Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10414111/pdf/
Citations: 0
Digital staining facilitates biomedical microscopy.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-07-26 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1243663
Michael John Fanous, Nir Pillar, Aydogan Ozcan

Traditional staining of biological specimens for microscopic imaging entails time-consuming, laborious, and costly procedures, in addition to producing inconsistent labeling and causing irreversible sample damage. In recent years, computational "virtual" staining using deep learning techniques has evolved into a robust and comprehensive application for streamlining the staining process without typical histochemical staining-related drawbacks. Such virtual staining techniques can also be combined with neural networks designed to correct various microscopy aberrations, such as out-of-focus or motion blur artifacts, and improve upon diffraction-limited resolution. Here, we highlight how such methods lead to a host of new opportunities that can significantly improve both sample preparation and imaging in biomedical microscopy.

Frontiers in bioinformatics, vol. 3, article 1243663 (2023-07-26). Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10411189/pdf/
Citations: 0
A review on deep learning applications in highly multiplexed tissue imaging data analysis.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-07-26 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1159381
Mohammed Zidane, Ahmad Makky, Matthias Bruhns, Alexander Rochwarger, Sepideh Babaei, Manfred Claassen, Christian M Schürch

Since its introduction into the field of oncology, deep learning (DL) has impacted clinical discoveries and biomarker predictions. DL-driven discoveries and predictions in oncology are based on a variety of biological data such as genomics, proteomics, and imaging data. DL-based computational frameworks can predict genetic variant effects on gene expression, as well as protein structures based on amino acid sequences. Furthermore, DL algorithms can capture valuable mechanistic biological information from several spatial "omics" technologies, such as spatial transcriptomics and spatial proteomics. Here, we review the impact that the combination of artificial intelligence (AI) with spatial omics technologies has had on oncology, focusing on DL and its applications in biomedical image analysis, encompassing cell segmentation, cell phenotype identification, cancer prognostication, and therapy prediction. We highlight the advantages of using highly multiplexed images (spatial proteomics data) compared to single-stained, conventional histopathological ("simple") images, as the former can provide deep mechanistic insights that cannot be obtained by the latter, even with the aid of explainable AI. Furthermore, we provide the reader with the advantages/disadvantages of DL-based pipelines used in preprocessing highly multiplexed images (cell segmentation, cell type annotation). Therefore, this review also guides the reader to choose the DL-based pipeline that best fits their data. In conclusion, DL continues to be established as an essential tool in discovering novel biological mechanisms when combined with technologies such as highly multiplexed tissue imaging data. In balance with conventional medical data, its role in clinical routine will become more important, supporting diagnosis and prognosis in oncology, enhancing clinical decision-making, and improving the quality of care for patients.

Frontiers in bioinformatics, vol. 3, article 1159381 (2023-07-26). Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10410935/pdf/
Citations: 0
Moving beyond the desktop: prospects for practical bioimage analysis via the web.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-07-25 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1233748
Wei Ouyang, Kevin W Eliceiri, Beth A Cimini

As biological imaging continues to rapidly advance, it results in increasingly complex image data, necessitating a reevaluation of conventional bioimage analysis methods and their accessibility. This perspective underscores our belief that a transition from desktop-based tools to web-based bioimage analysis could unlock immense opportunities for improved accessibility, enhanced collaboration, and streamlined workflows. We outline the potential benefits, such as reduced local computational demands and solutions to common challenges, including software installation issues and limited reproducibility. Furthermore, we explore the present state of web-based tools, hurdles in implementation, and the significance of collective involvement from the scientific community in driving this transition. In acknowledging the potential roadblocks and complexity of data management, we suggest a combined approach of selective prototyping and large-scale workflow application for optimal usage. Embracing web-based bioimage analysis could pave the way for the life sciences community to accelerate biological research, offering a robust platform for a more collaborative, efficient, and democratized science.

Frontiers in bioinformatics, vol. 3, article 1233748 (2023-07-25). Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10409478/pdf/
Citations: 0
Geometric deep learning as a potential tool for antimicrobial peptide prediction.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-07-13 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1216362
Fabiano C Fernandes, Marlon H Cardoso, Abel Gil-Ley, Lívia V Luchi, Maria G L da Silva, Maria L R Macedo, Cesar de la Fuente-Nunez, Octavio L Franco

Antimicrobial peptides (AMPs) are components of natural immunity against invading pathogens. They are polymers that fold into a variety of three-dimensional structures, enabling their function, with an underlying sequence that is best represented in a non-flat space. The structural data of AMPs exhibits non-Euclidean characteristics, which means that certain properties, e.g., differential manifolds, common system of coordinates, vector space structure, or translation-equivariance, along with basic operations like convolution, in non-Euclidean space are not distinctly established. Geometric deep learning (GDL) refers to a category of machine learning methods that utilize deep neural models to process and analyze data in non-Euclidean settings, such as graphs and manifolds. This emerging field seeks to expand the use of structured models to these domains. This review provides a detailed summary of the latest developments in designing and predicting AMPs utilizing GDL techniques and also discusses both current research gaps and future directions in the field.
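
Since grid-style convolution is not defined on a graph, geometric deep learning methods typically replace it with neighbourhood aggregation. The sketch below shows one mean-aggregation message-passing step on a tiny graph. It is a generic illustration rather than any model from the review: the function name, the residue-contact interpretation of the edges, and the feature values are assumptions.

```python
def graph_conv_step(node_features, edges):
    """One mean-aggregation message-passing step over an undirected graph.

    node_features : list of equal-length feature vectors, one per node
                    (e.g. one per residue of a peptide)
    edges         : list of (i, j) index pairs (e.g. residue contacts)
    Each node's new feature is the mean over itself and its neighbours.
    """
    n = len(node_features)
    dim = len(node_features[0])
    neighbours = {i: [i] for i in range(n)}  # include a self-loop
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    updated = []
    for i in range(n):
        group = neighbours[i]
        updated.append([
            sum(node_features[j][d] for j in group) / len(group)
            for d in range(dim)
        ])
    return updated


if __name__ == "__main__":
    # Three residues in a chain, each with a single scalar feature.
    features = [[1.0], [3.0], [5.0]]
    contacts = [(0, 1), (1, 2)]
    print(graph_conv_step(features, contacts))
```

The update depends only on which nodes are connected, not on how they are numbered or laid out, which is the property that makes this kind of operator usable where no common coordinate system exists. A trainable layer would additionally multiply the aggregated features by a learned weight matrix and apply a nonlinearity.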

Citations: 0
Translating theory into practice: assessing the privacy implications of concept-based explanations for biomedical AI.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-07-05 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1194993
Adriano Lucieri, Andreas Dengel, Sheraz Ahmed

Artificial Intelligence (AI) has achieved remarkable success in image generation, image analysis, and language modeling, making data-driven techniques increasingly relevant in practical real-world applications and promising enhanced creativity and efficiency for human users. However, the deployment of AI in high-stakes domains such as infrastructure and healthcare still raises concerns regarding algorithm accountability and safety. The emerging field of explainable AI (XAI) has made significant strides in developing interfaces that enable humans to comprehend the decisions made by data-driven models. Among these approaches, concept-based explainability stands out due to its ability to align explanations with high-level concepts familiar to users. Nonetheless, early research in adversarial machine learning has revealed that exposing model explanations can render victim models more susceptible to attacks. This is the first study to investigate and compare the impact of concept-based explanations on the privacy of deep-learning-based AI models in the context of biomedical image analysis. An extensive privacy benchmark is conducted on three different state-of-the-art model architectures (ResNet50, NFNet, ConvNeXt) trained on two biomedical datasets (ISIC and EyePACS) and one synthetic dataset (SCDB). The success of membership inference attacks while exposing varying degrees of attribution-based and concept-based explanations is systematically compared. The findings indicate that, in theory, concept-based explanations can potentially increase the vulnerability of a private AI system by up to 16% compared to attributions in the baseline setting. However, it is demonstrated that, in more realistic attack scenarios, the threat posed by explanations is negligible in practice. Furthermore, actionable recommendations are provided to ensure the safe deployment of concept-based XAI systems. In addition, the impact of differential privacy (DP) on the quality of concept-based explanations is explored, revealing that while negatively influencing the explanation ability, DP can have an adverse effect on the models' privacy.
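
To make the notion of a membership inference attack concrete, the following minimal sketch implements a generic confidence-thresholding attack: the attacker guesses that a sample was in the training set whenever the model's confidence on it exceeds a threshold. This is only an assumed, simplified stand-in and not the attack evaluated in the study; the confidence values, the threshold, and the function names are invented for illustration.

```python
from typing import List


def threshold_attack(confidences: List[float], threshold: float) -> List[bool]:
    """Guess 'member' for every sample whose confidence reaches the threshold."""
    return [c >= threshold for c in confidences]


def attack_accuracy(guesses: List[bool], is_member: List[bool]) -> float:
    """Fraction of samples whose membership status was guessed correctly."""
    correct = sum(g == m for g, m in zip(guesses, is_member))
    return correct / len(is_member)


# Hypothetical model confidences on training samples (members)
# and on held-out samples (non-members).
member_conf = [0.99, 0.97, 0.91, 0.72]
nonmember_conf = [0.95, 0.70, 0.64, 0.55]

confidences = member_conf + nonmember_conf
labels = [True] * len(member_conf) + [False] * len(nonmember_conf)

guesses = threshold_attack(confidences, threshold=0.9)
accuracy = attack_accuracy(guesses, labels)
```

With a balanced set of members and non-members, an accuracy of 0.5 corresponds to random guessing, so the margin above 0.5 is one simple way to express how much membership information leaks.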

Citations: 0
Comparison of phasor analysis and biexponential decay curve fitting of autofluorescence lifetime imaging data for machine learning prediction of cellular phenotypes.
Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-06-29 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1210157
Linghao Hu, Blanche Ter Hofstede, Dhavan Sharma, Feng Zhao, Alex J Walsh

Introduction: Autofluorescence imaging of the coenzymes reduced nicotinamide adenine dinucleotide (phosphate) (NAD(P)H) and oxidized flavin adenine dinucleotide (FAD) provides a label-free method to detect cellular metabolism and phenotypes. Time-domain fluorescence lifetime data can be analyzed by exponential decay fitting to extract fluorescence lifetimes or by a fit-free phasor transformation to compute phasor coordinates. Methods: Here, fluorescence lifetime data analysis by biexponential decay curve fitting is compared with phasor coordinate analysis as input data to machine learning models to predict cell phenotypes. Glycolysis and oxidative phosphorylation of MCF7 breast cancer cells were chemically inhibited with 2-deoxy-d-glucose and sodium cyanide, respectively, and fluorescence lifetime images of NAD(P)H and FAD were obtained using a multiphoton microscope. Results: Machine learning algorithms built from either the extracted lifetime values or phasor coordinates predict MCF7 metabolism with high accuracy (∼88%). Similarly, fluorescence lifetime images of M0, M1, and M2 macrophages were acquired and analyzed by decay fitting and phasor analysis. Machine learning models trained with features from curve fitting discriminate different macrophage phenotypes with improved performance over models trained using only phasor coordinates. Discussion: Altogether, the results demonstrate that both curve fitting and phasor analysis of autofluorescence lifetime images can be used in machine learning models for classification of cell phenotype from the lifetime data.
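
The fit-free phasor transformation mentioned above can be sketched as follows: the decay histogram is projected onto a cosine and a sine at the first harmonic of the measurement window and normalized by the total counts, giving the coordinates (g, s). The acquisition parameters below (a 12.5 ns window, 256 time bins, a 2.5 ns lifetime) and the function name `phasor_coordinates` are assumptions made for illustration and are not taken from the paper.

```python
import math
from typing import List, Tuple


def phasor_coordinates(decay: List[float], bin_width_ns: float) -> Tuple[float, float]:
    """Return phasor coordinates (g, s) at the first harmonic of the time window."""
    period_ns = len(decay) * bin_width_ns
    omega = 2.0 * math.pi / period_ns  # angular frequency in rad/ns
    total = sum(decay)
    # Evaluate the cosine and sine at the centre of each time bin.
    g = sum(
        count * math.cos(omega * (k + 0.5) * bin_width_ns)
        for k, count in enumerate(decay)
    ) / total
    s = sum(
        count * math.sin(omega * (k + 0.5) * bin_width_ns)
        for k, count in enumerate(decay)
    ) / total
    return g, s


# Hypothetical acquisition: 12.5 ns window, 256 bins, noise-free
# mono-exponential decay with a 2.5 ns lifetime.
n_bins = 256
window_ns = 12.5
bin_width_ns = window_ns / n_bins
tau_ns = 2.5
decay = [math.exp(-(k + 0.5) * bin_width_ns / tau_ns) for k in range(n_bins)]

g, s = phasor_coordinates(decay, bin_width_ns)

# Analytical phasor of a mono-exponential decay, for comparison.
omega = 2.0 * math.pi / window_ns
g_expected = 1.0 / (1.0 + (omega * tau_ns) ** 2)
s_expected = omega * tau_ns / (1.0 + (omega * tau_ns) ** 2)
```

A mono-exponential decay lands on the universal semicircle (g - 0.5)² + s² = 0.25, whereas a biexponential decay falls inside it, on the line segment joining the phasors of its two components. This is why phasor coordinates can stand in for fitted lifetimes as machine learning features without any curve fitting.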

Citations: 0
DeepRaccess: high-speed RNA accessibility prediction using deep learning
Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-05-25 DOI: 10.1101/2023.05.25.542237
Kaisei Hara, Natsuki Iwano, Tsukasa Fukunaga, Michiaki Hamada
RNA accessibility is a useful RNA secondary structural feature for predicting RNA-RNA interactions and translation efficiency in prokaryotes. However, conventional accessibility calculation tools, such as Raccess, are computationally expensive and require considerable time for transcriptome-scale analyses. In this study, we developed DeepRaccess, which predicts RNA accessibility using deep learning. DeepRaccess was trained to take artificial RNA sequences as input and to predict the accessibility of these sequences as calculated by Raccess. Analyses of simulated and empirical datasets showed that the accessibility predicted by DeepRaccess was highly correlated with the accessibility calculated by Raccess. In addition, we confirmed that DeepRaccess can predict protein abundance in E. coli with moderate accuracy from the sequences around the start codon. We also demonstrated that DeepRaccess achieved a speed-up of tens to hundreds of times in a GPU environment. The source code and trained models of DeepRaccess are freely available at https://github.com/hmdlab/DeepRaccess.
Citations: 0