Frontiers in Bioinformatics: latest articles, page 10

Editorial: Recent advances in peptide informatics: challenges and opportunities.

Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2023-09-12 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1271932

Rahul Kumar, Kumardeep Chaudhary, Sandeep Kumar Dhanda

Peptide informatics is a rapidly growing field that is at the intersection of bioinformatics, chemistry, and biology. Peptides are short chains of amino acids that play important roles in a wide variety of biological processes, such as protein folding, signal transduction, and immune function. Peptide informatics is the use of computational methods to study peptides and their sequence, structure, function, and interactions. Recent advances in peptide informatics have led to a number of new discoveries and applications. For example, new methods have been developed to predict the structure of peptides, which can be used to design new drugs and therapies. New methods for identifying peptide-protein interactions have also been introduced, which can be used to understand the molecular basis of disease.

Citations: 0

The origin of eukaryotes and rise in complexity were synchronous with the rise in oxygen.

IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2023-09-01 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1233281

Jack M Craig, Sudhir Kumar, S Blair Hedges

The origin of eukaryotes was among the most important events in the history of life, spawning a new evolutionary lineage that led to all complex multicellular organisms. However, the timing of this event, crucial for understanding its environmental context, has been difficult to establish. The fossil and biomarker records are sparse and molecular clocks have thus far not reached a consensus, with dates spanning 2.1-0.91 billion years ago (Ga) for critical nodes. Notably, molecular time estimates for the last common ancestor of eukaryotes are typically hundreds of millions of years younger than the Great Oxidation Event (GOE, 2.43-2.22 Ga), leading researchers to question the presumptive link between eukaryotes and oxygen. We obtained a new time estimate for the origin of eukaryotes using genetic data of both archaeal and bacterial origin, the latter rarely used in past studies. We also avoided potential calibration biases that may have affected earlier studies. We obtained a conservative interval of 2.2-1.5 Ga, with an even narrower core interval of 2.0-1.8 Ga, for the origin of eukaryotes, a period closely aligned with the rise in oxygen. We further reconstructed the history of biological complexity across the tree of life using three universal measures: cell types, genes, and genome size. We found that the rise in complexity was temporally consistent with and followed a pattern similar to the rise in oxygen. This suggests a causal relationship stemming from the increased energy needs of complex life fulfilled by oxygen.

Citations: 0

Orthogonal outlier detection and dimension estimation for improved MDS embedding of biological datasets.

IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2023-08-10 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1211819

Wanxin Li, Jules Mirone, Ashok Prasad, Nina Miolane, Carine Legrand, Khanh Dao Duc

Conventional dimensionality reduction methods like Multidimensional Scaling (MDS) are sensitive to the presence of orthogonal outliers, leading to significant defects in the embedding. We introduce a robust MDS method, called DeCOr-MDS (Detection and Correction of Orthogonal outliers using MDS), based on the geometry and statistics of simplices formed by data points, that detects orthogonal outliers and subsequently reduces dimensionality. We validate our method using synthetic datasets, and further show how it can be applied to a variety of large real biological datasets, including cancer image cell data, human microbiome project data and single cell RNA sequencing data, to address the task of data cleaning and visualization.
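The simplex-based idea in this abstract can be illustrated with a toy sketch. This is not the DeCOr-MDS implementation: it assumes a one-dimensional subspace, uses triangles (2-simplices) only, and the function names `triangle_height` and `median_heights` are hypothetical. A point lying off the line has a large height over triangles formed with pairs of other points, and that height can be computed from pairwise distances alone.

```python
import math
from itertools import combinations
from statistics import median


def triangle_height(d_ij, d_ik, d_jk):
    """Height of point i above the line through j and k, from distances only."""
    s = (d_ij + d_ik + d_jk) / 2.0
    # Heron's formula; clamp at 0 to absorb rounding error for collinear points.
    area_sq = max(s * (s - d_ij) * (s - d_ik) * (s - d_jk), 0.0)
    return 2.0 * math.sqrt(area_sq) / d_jk


def median_heights(points):
    """For each point, the median height over all triangles with two other points."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    result = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        heights = [
            triangle_height(dist[i][j], dist[i][k], dist[j][k])
            for j, k in combinations(others, 2)
        ]
        result.append(median(heights))
    return result


# Toy data (an assumption for illustration): eight points on a line in 3-D,
# plus one point displaced orthogonally to that line.
points = [(float(t), 0.0, 0.0) for t in range(8)] + [(3.5, 0.0, 5.0)]
heights = median_heights(points)
outlier = max(range(len(points)), key=heights.__getitem__)
print("index with the largest median height:", outlier)
```

Taking the median rather than the mean keeps the inliers' scores near zero even though some of their triangles include the outlier.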

Citations: 0

Molecular timetrees using relaxed clocks and uncertain phylogenies.

Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2023-08-03 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1225807

Jose Barba-Montoya, Sudip Sharma, Sudhir Kumar

A common practice in molecular systematics is to infer phylogeny and then scale it to time by using a relaxed clock method and calibrations. This sequential analysis practice ignores the effect of phylogenetic uncertainty on divergence time estimates and their confidence/credibility intervals. An alternative is to infer phylogeny and times jointly to incorporate phylogenetic errors into molecular dating. We compared the performance of these two alternatives in reconstructing evolutionary timetrees using computer-simulated and empirical datasets. We found sequential and joint analyses to produce similar divergence times and phylogenetic relationships, except for some nodes in particular cases. The joint inference performed better when the phylogeny was not well resolved, situations in which the joint inference should be preferred. However, joint inference can be infeasible for large datasets because available Bayesian methods are computationally burdensome. We present an alternative approach for joint inference that combines the bag of little bootstraps, maximum likelihood, and RelTime approaches for simultaneously inferring evolutionary relationships, divergence times, and confidence intervals, incorporating phylogeny uncertainty. The new method alleviates the high computational burden imposed by Bayesian methods while achieving a similar result.
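The bag of little bootstraps named in this abstract can be sketched generically. The example below applies it to the standard error of a sample mean, not to phylogenies or divergence times, and it is not the authors' pipeline. The function name `blb_standard_error`, its parameters, and the simulated data are assumptions made for illustration.

```python
import random
import statistics


def blb_standard_error(data, n_subsets=5, gamma=0.6, n_resamples=50, seed=0):
    """Bag-of-little-bootstraps estimate of the standard error of the mean."""
    rng = random.Random(seed)
    n = len(data)
    b = int(n ** gamma)  # size of each "little" subset, much smaller than n
    per_subset = []
    for _ in range(n_subsets):
        subset = rng.sample(data, b)  # subset drawn without replacement
        estimates = []
        for _ in range(n_resamples):
            # Resample n points with replacement from the b-point subset.
            # (Real implementations draw multinomial counts instead of
            # materializing all n points; this is the simple equivalent.)
            resample = rng.choices(subset, k=n)
            estimates.append(statistics.fmean(resample))
        per_subset.append(statistics.stdev(estimates))
    # Average the per-subset uncertainty estimates.
    return statistics.fmean(per_subset)


data_rng = random.Random(42)
data = [data_rng.gauss(0.0, 1.0) for _ in range(2000)]
se = blb_standard_error(data)
print("BLB standard error of the mean:", se)
print("theoretical value 1/sqrt(n):   ", 1 / len(data) ** 0.5)
```

Each resample has the full size n but contains only b distinct values, which is where the computational saving comes from when the estimator is expensive to evaluate.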

Citations: 0

XMR: an explainable multimodal neural network for drug response prediction.

Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2023-08-02 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1164482

Zihao Wang, Yun Zhou, Yu Zhang, Yu K Mo, Yijie Wang

Introduction: Existing large-scale preclinical cancer drug response databases provide us with a great opportunity to identify and predict potentially effective drugs to combat cancers. Deep learning models built on these databases have been developed and applied to tackle the cancer drug-response prediction task. Their prediction has been demonstrated to significantly outperform traditional machine learning methods. However, due to the "black box" characteristic, biologically faithful explanations are hardly derived from these deep learning models. Interpretable deep learning models that rely on visible neural networks (VNNs) have been proposed to provide biological justification for the predicted outcomes. However, their performance does not meet the expectation to be applied in clinical practice. Methods: In this paper, we develop an XMR model, an eXplainable Multimodal neural network for drug Response prediction. XMR is a new compact multimodal neural network consisting of two sub-networks: a visible neural network for learning genomic features and a graph neural network (GNN) for learning drugs' structural features. Both sub-networks are integrated into a multimodal fusion layer to model the drug response for the given gene mutations and the drug's molecular structures. Furthermore, a pruning approach is applied to provide better interpretations of the XMR model. We use five pathway hierarchies (cell cycle, DNA repair, diseases, signal transduction, and metabolism), which are obtained from the Reactome Pathway Database, as the architecture of VNN for our XMR model to predict drug responses of triple negative breast cancer. Results: We find that our model outperforms other state-of-the-art interpretable deep learning models in terms of predictive performance. In addition, our model can provide biological insights into explaining drug responses for triple-negative breast cancer. 
Discussion: Overall, combining both VNN and GNN in a multimodal fusion layer, XMR captures key genomic and molecular features and offers reasonable interpretability in biology, thereby better predicting drug responses in cancer patients. Our model would also benefit personalized cancer therapy in the future.

Citations: 0

3' RNA-seq is superior to standard RNA-seq in cases of sparse data but inferior at identifying toxicity pathways in a model organism.

IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2023-07-27 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1234218

Ryan S McClure, Yvonne Rericha, Katrina M Waters, Robyn L Tanguay

Introduction: The application of RNA-sequencing has led to numerous breakthroughs related to investigating gene expression levels in complex biological systems. Among these are knowledge of how organisms, such as the vertebrate model organism zebrafish (Danio rerio), respond to toxicant exposure. Recently, the development of 3' RNA-seq has allowed for the determination of gene expression levels with a fraction of the required reads compared to standard RNA-seq. While 3' RNA-seq has many advantages, a comparison to standard RNA-seq has not been performed in the context of whole organism toxicity and sparse data. Methods and results: Here, we examined samples from zebrafish exposed to perfluorobutane sulfonamide (FBSA) with either 3' or standard RNA-seq to determine the advantages of each with regards to the identification of functionally enriched pathways. We found that 3' and standard RNA-seq showed specific advantages when focusing on annotated or unannotated regions of the genome. We also found that standard RNA-seq identified more differentially expressed genes (DEGs), but that this advantage disappeared under conditions of sparse data. We also found that standard RNA-seq had a significant advantage in identifying functionally enriched pathways via analysis of DEG lists but that this advantage was minimal when identifying pathways via gene set enrichment analysis of all genes. Conclusions: These results show that each approach has experimental conditions where they may be advantageous. Our observations can help guide others in the choice of 3' RNA-seq vs standard RNA sequencing to query gene expression levels in a range of biological systems.
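Pathway identification from a DEG list is commonly done with an over-representation test. The abstract does not say which tool or statistic the authors used, so the sketch below is only a generic one-sided hypergeometric test. The function name `hypergeom_enrichment_p` and all gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_degs, n_overlap):
    """P(overlap >= n_overlap) when n_degs genes are drawn without replacement
    from n_background genes, of which n_pathway belong to the pathway."""
    total = comb(n_background, n_degs)
    upper = min(n_pathway, n_degs)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 background genes, a 150-gene pathway,
# 300 DEGs, 12 of which fall in the pathway.
p_value = hypergeom_enrichment_p(20000, 150, 300, 12)
print("over-representation p-value:", p_value)
```

A test of this kind depends on a DEG cutoff, whereas gene set enrichment analysis ranks all genes, which is the distinction the abstract draws between the two ways of identifying pathways.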

Citations: 0

Digital staining facilitates biomedical microscopy.

IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2023-07-26 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1243663

Michael John Fanous, Nir Pillar, Aydogan Ozcan

Traditional staining of biological specimens for microscopic imaging entails time-consuming, laborious, and costly procedures, in addition to producing inconsistent labeling and causing irreversible sample damage. In recent years, computational "virtual" staining using deep learning techniques has evolved into a robust and comprehensive application for streamlining the staining process without typical histochemical staining-related drawbacks. Such virtual staining techniques can also be combined with neural networks designed to correct various microscopy aberrations, such as out-of-focus or motion blur artifacts, and improve upon diffraction-limited resolution. Here, we highlight how such methods lead to a host of new opportunities that can significantly improve both sample preparation and imaging in biomedical microscopy.

Citations: 0

A review on deep learning applications in highly multiplexed tissue imaging data analysis.

IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2023-07-26 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1159381

Mohammed Zidane, Ahmad Makky, Matthias Bruhns, Alexander Rochwarger, Sepideh Babaei, Manfred Claassen, Christian M Schürch

Since its introduction into the field of oncology, deep learning (DL) has impacted clinical discoveries and biomarker predictions. DL-driven discoveries and predictions in oncology are based on a variety of biological data such as genomics, proteomics, and imaging data. DL-based computational frameworks can predict genetic variant effects on gene expression, as well as protein structures based on amino acid sequences. Furthermore, DL algorithms can capture valuable mechanistic biological information from several spatial "omics" technologies, such as spatial transcriptomics and spatial proteomics. Here, we review the impact that the combination of artificial intelligence (AI) with spatial omics technologies has had on oncology, focusing on DL and its applications in biomedical image analysis, encompassing cell segmentation, cell phenotype identification, cancer prognostication, and therapy prediction. We highlight the advantages of using highly multiplexed images (spatial proteomics data) compared to single-stained, conventional histopathological ("simple") images, as the former can provide deep mechanistic insights that cannot be obtained by the latter, even with the aid of explainable AI. Furthermore, we provide the reader with the advantages/disadvantages of DL-based pipelines used in preprocessing highly multiplexed images (cell segmentation, cell type annotation). Therefore, this review also guides the reader to choose the DL-based pipeline that best fits their data. In conclusion, DL continues to be established as an essential tool in discovering novel biological mechanisms when combined with technologies such as highly multiplexed tissue imaging data. In balance with conventional medical data, its role in clinical routine will become more important, supporting diagnosis and prognosis in oncology, enhancing clinical decision-making, and improving the quality of care for patients. 

Citations: 0

Moving beyond the desktop: prospects for practical bioimage analysis via the web.

IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2023-07-25 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1233748

Wei Ouyang, Kevin W Eliceiri, Beth A Cimini

As biological imaging continues to rapidly advance, it results in increasingly complex image data, necessitating a reevaluation of conventional bioimage analysis methods and their accessibility. This perspective underscores our belief that a transition from desktop-based tools to web-based bioimage analysis could unlock immense opportunities for improved accessibility, enhanced collaboration, and streamlined workflows. We outline the potential benefits, such as reduced local computational demands and solutions to common challenges, including software installation issues and limited reproducibility. Furthermore, we explore the present state of web-based tools, hurdles in implementation, and the significance of collective involvement from the scientific community in driving this transition. In acknowledging the potential roadblocks and complexity of data management, we suggest a combined approach of selective prototyping and large-scale workflow application for optimal usage. Embracing web-based bioimage analysis could pave the way for the life sciences community to accelerate biological research, offering a robust platform for a more collaborative, efficient, and democratized science.

Citations: 0

Geometric deep learning as a potential tool for antimicrobial peptide prediction.

IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2023-07-13 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1216362

Fabiano C Fernandes, Marlon H Cardoso, Abel Gil-Ley, Lívia V Luchi, Maria G L da Silva, Maria L R Macedo, Cesar de la Fuente-Nunez, Octavio L Franco

Antimicrobial peptides (AMPs) are components of natural immunity against invading pathogens. They are polymers that fold into a variety of three-dimensional structures, enabling their function, with an underlying sequence that is best represented in a non-flat space. The structural data of AMPs exhibits non-Euclidean characteristics, which means that certain properties, e.g., differential manifolds, common system of coordinates, vector space structure, or translation-equivariance, along with basic operations like convolution, in non-Euclidean space are not distinctly established. Geometric deep learning (GDL) refers to a category of machine learning methods that utilize deep neural models to process and analyze data in non-Euclidean settings, such as graphs and manifolds. This emerging field seeks to expand the use of structured models to these domains. This review provides a detailed summary of the latest developments in designing and predicting AMPs utilizing GDL techniques and also discusses both current research gaps and future directions in the field.

Citations: 0