
Frontiers in bioinformatics: latest articles

Artificial intelligence in imaging flow cytometry.
Pub Date : 2023-10-09 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1229052
Paolo Pozzi, Alessia Candeo, Petra Paiè, Francesca Bragheri, Andrea Bassi
Citations: 0
NeighborNet: improved algorithms and implementation.
Pub Date : 2023-09-20 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1178600
David Bryant, Daniel H Huson

NeighborNet constructs phylogenetic networks to visualize distance data. It is a popular method used in a wide range of applications. While several studies have investigated its mathematical features, here we focus on computational aspects. The algorithm operates in three steps. We present a new simplified formulation of the first step, which aims at computing a circular ordering. We provide the first technical description of the second step, the estimation of split weights. We review the third step by constructing and drawing the network. Finally, we discuss how the networks might best be interpreted, review related approaches, and present some open questions.
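
To make the notions of a circular ordering and its splits concrete, the sketch below enumerates every split that is compatible with a given circular ordering of taxa. It is an illustration only, not the NeighborNet algorithm from the paper: the function name `circular_splits` and the five-taxon ordering are assumptions introduced here. A circular ordering of n taxa admits n(n-1)/2 such splits, one for each pair of places where the cycle can be cut.

```python
from itertools import combinations


def circular_splits(ordering):
    """Return all splits compatible with a circular ordering of taxa.

    Hypothetical helper for illustration; not taken from the paper.
    Each split is a pair (part_a, part_b) of frozensets obtained by
    cutting the cycle in two places.
    """
    taxa = frozenset(ordering)
    splits = []
    for i, j in combinations(range(len(ordering)), 2):
        # Positions i+1..j form a contiguous arc that never contains
        # position 0, so every split is generated exactly once.
        part_a = frozenset(ordering[i + 1 : j + 1])
        splits.append((part_a, taxa - part_a))
    return splits


splits = circular_splits(["A", "B", "C", "D", "E"])
print(len(splits))  # 5 * 4 / 2 = 10 splits
```

The second step described in the abstract estimates a weight for each of these candidate splits.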

Citations: 0
VariBench, new variation benchmark categories and data sets.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-09-19 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1248732
Niloofar Shirvanizadeh, Mauno Vihinen
Citations: 0
Corrigendum: A review on deep learning applications in highly multiplexed tissue imaging data analysis.
Pub Date : 2023-09-13 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1287407
Mohammed Zidane, Ahmad Makky, Matthias Bruhns, Alexander Rochwarger, Sepideh Babaei, Manfred Claassen, Christian M Schürch

[This corrects the article DOI: 10.3389/fbinf.2023.1159381.].

Citations: 0
Editorial: Recent advances in peptide informatics: challenges and opportunities.
Pub Date : 2023-09-12 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1271932
Rahul Kumar, Kumardeep Chaudhary, Sandeep Kumar Dhanda
Peptide informatics is a rapidly growing field that is at the intersection of bioinformatics, chemistry, and biology. Peptides are short chains of amino acids that play important roles in a wide variety of biological processes, such as protein folding, signal transduction, and immune function. Peptide informatics is the use of computational methods to study peptides and their sequence, structure, function, and interactions. Recent advances in peptide informatics have led to a number of new discoveries and applications. For example, new methods have been developed to predict the structure of peptides, which can be used to design new drugs and therapies. New methods for identifying peptide-protein interactions have also been introduced, which can be used to understand the molecular basis of disease.
Citations: 0
The origin of eukaryotes and rise in complexity were synchronous with the rise in oxygen.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-09-01 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1233281
Jack M Craig, Sudhir Kumar, S Blair Hedges

The origin of eukaryotes was among the most important events in the history of life, spawning a new evolutionary lineage that led to all complex multicellular organisms. However, the timing of this event, crucial for understanding its environmental context, has been difficult to establish. The fossil and biomarker records are sparse and molecular clocks have thus far not reached a consensus, with dates spanning 2.1-0.91 billion years ago (Ga) for critical nodes. Notably, molecular time estimates for the last common ancestor of eukaryotes are typically hundreds of millions of years younger than the Great Oxidation Event (GOE, 2.43-2.22 Ga), leading researchers to question the presumptive link between eukaryotes and oxygen. We obtained a new time estimate for the origin of eukaryotes using genetic data of both archaeal and bacterial origin, the latter rarely used in past studies. We also avoided potential calibration biases that may have affected earlier studies. We obtained a conservative interval of 2.2-1.5 Ga, with an even narrower core interval of 2.0-1.8 Ga, for the origin of eukaryotes, a period closely aligned with the rise in oxygen. We further reconstructed the history of biological complexity across the tree of life using three universal measures: cell types, genes, and genome size. We found that the rise in complexity was temporally consistent with and followed a pattern similar to the rise in oxygen. This suggests a causal relationship stemming from the increased energy needs of complex life fulfilled by oxygen.

Citations: 0
Orthogonal outlier detection and dimension estimation for improved MDS embedding of biological datasets.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-08-10 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1211819
Wanxin Li, Jules Mirone, Ashok Prasad, Nina Miolane, Carine Legrand, Khanh Dao Duc

Conventional dimensionality reduction methods like Multidimensional Scaling (MDS) are sensitive to the presence of orthogonal outliers, leading to significant defects in the embedding. We introduce a robust MDS method, called DeCOr-MDS (Detection and Correction of Orthogonal outliers using MDS), based on the geometry and statistics of simplices formed by data points, that detects orthogonal outliers and subsequently reduces dimensionality. We validate our method using synthetic datasets, and further show how it can be applied to a variety of large real biological datasets, including cancer image cell data, human microbiome project data and single cell RNA sequencing data, to address the task of data cleaning and visualization.

Citations: 0
Molecular timetrees using relaxed clocks and uncertain phylogenies.
Pub Date : 2023-08-03 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1225807
Jose Barba-Montoya, Sudip Sharma, Sudhir Kumar

A common practice in molecular systematics is to infer phylogeny and then scale it to time by using a relaxed clock method and calibrations. This sequential analysis practice ignores the effect of phylogenetic uncertainty on divergence time estimates and their confidence/credibility intervals. An alternative is to infer phylogeny and times jointly to incorporate phylogenetic errors into molecular dating. We compared the performance of these two alternatives in reconstructing evolutionary timetrees using computer-simulated and empirical datasets. We found sequential and joint analyses to produce similar divergence times and phylogenetic relationships, except for some nodes in particular cases. The joint inference performed better when the phylogeny was not well resolved, situations in which the joint inference should be preferred. However, joint inference can be infeasible for large datasets because available Bayesian methods are computationally burdensome. We present an alternative approach for joint inference that combines the bag of little bootstraps, maximum likelihood, and RelTime approaches for simultaneously inferring evolutionary relationships, divergence times, and confidence intervals, incorporating phylogeny uncertainty. The new method alleviates the high computational burden imposed by Bayesian methods while achieving a similar result.
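
As a rough illustration of the bag of little bootstraps mentioned in the abstract, the sketch below applies the generic resampling scheme to the standard error of a mean. It is not the authors' phylogenetic pipeline: the function name `blb_standard_error`, its parameters and defaults, and the simulated data are all assumptions made here for illustration. The idea is to draw a few small subsamples, resample each one back up to the full data size using counts only, and average the resulting error estimates.

```python
import random
import statistics
from collections import Counter


def blb_standard_error(data, gamma=0.6, n_subsamples=5, n_replicates=50, seed=0):
    """Bag-of-little-bootstraps estimate of the standard error of the mean.

    Generic illustration on a list of numbers; all parameter names and
    defaults are assumptions, not settings from the paper.
    """
    rng = random.Random(seed)
    n = len(data)
    b = max(2, int(n ** gamma))  # size of each little subsample
    per_subsample = []
    for _ in range(n_subsamples):
        subsample = rng.sample(data, b)  # drawn without replacement
        means = []
        for _ in range(n_replicates):
            # Resample n items from the b subsample items; only counts matter.
            counts = Counter(rng.choices(range(b), k=n))
            means.append(sum(subsample[i] * c for i, c in counts.items()) / n)
        per_subsample.append(statistics.stdev(means))
    # Average the per-subsample error estimates.
    return statistics.mean(per_subsample)


rng = random.Random(1)
values = [rng.gauss(0.0, 1.0) for _ in range(1000)]
# Compare with the analytic standard error 1 / sqrt(1000), roughly 0.032.
print(blb_standard_error(values))
```

Each replicate only touches the b distinct subsample items, which is where the computational saving over a full bootstrap comes from.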

Citations: 0
XMR: an explainable multimodal neural network for drug response prediction.
Pub Date : 2023-08-02 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1164482
Zihao Wang, Yun Zhou, Yu Zhang, Yu K Mo, Yijie Wang

Introduction: Existing large-scale preclinical cancer drug response databases provide us with a great opportunity to identify and predict potentially effective drugs to combat cancers. Deep learning models built on these databases have been developed and applied to tackle the cancer drug-response prediction task. Their prediction has been demonstrated to significantly outperform traditional machine learning methods. However, due to the "black box" characteristic, biologically faithful explanations can hardly be derived from these deep learning models. Interpretable deep learning models that rely on visible neural networks (VNNs) have been proposed to provide biological justification for the predicted outcomes. However, their performance does not yet meet the expectations for application in clinical practice.
Methods: In this paper, we develop an XMR model, an eXplainable Multimodal neural network for drug Response prediction. XMR is a new compact multimodal neural network consisting of two sub-networks: a visible neural network for learning genomic features and a graph neural network (GNN) for learning drugs' structural features. Both sub-networks are integrated into a multimodal fusion layer to model the drug response for the given gene mutations and the drug's molecular structures. Furthermore, a pruning approach is applied to provide better interpretations of the XMR model. We use five pathway hierarchies (cell cycle, DNA repair, diseases, signal transduction, and metabolism), which are obtained from the Reactome Pathway Database, as the architecture of VNN for our XMR model to predict drug responses of triple-negative breast cancer.
Results: We find that our model outperforms other state-of-the-art interpretable deep learning models in terms of predictive performance. In addition, our model can provide biological insights into explaining drug responses for triple-negative breast cancer.
Discussion: Overall, combining both VNN and GNN in a multimodal fusion layer, XMR captures key genomic and molecular features and offers reasonable interpretability in biology, thereby better predicting drug responses in cancer patients. Our model would also benefit personalized cancer therapy in the future.

Citations: 0
3' RNA-seq is superior to standard RNA-seq in cases of sparse data but inferior at identifying toxicity pathways in a model organism.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-07-27 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1234218
Ryan S McClure, Yvonne Rericha, Katrina M Waters, Robyn L Tanguay

Introduction: The application of RNA-sequencing has led to numerous breakthroughs related to investigating gene expression levels in complex biological systems. Among these are knowledge of how organisms, such as the vertebrate model organism zebrafish (Danio rerio), respond to toxicant exposure. Recently, the development of 3' RNA-seq has allowed for the determination of gene expression levels with a fraction of the required reads compared to standard RNA-seq. While 3' RNA-seq has many advantages, a comparison to standard RNA-seq has not been performed in the context of whole organism toxicity and sparse data.
Methods and results: Here, we examined samples from zebrafish exposed to perfluorobutane sulfonamide (FBSA) with either 3' or standard RNA-seq to determine the advantages of each with regard to the identification of functionally enriched pathways. We found that 3' and standard RNA-seq showed specific advantages when focusing on annotated or unannotated regions of the genome. We also found that standard RNA-seq identified more differentially expressed genes (DEGs), but that this advantage disappeared under conditions of sparse data. We also found that standard RNA-seq had a significant advantage in identifying functionally enriched pathways via analysis of DEG lists but that this advantage was minimal when identifying pathways via gene set enrichment analysis of all genes.
Conclusions: These results show that each approach has experimental conditions under which it may be advantageous. Our observations can help guide others in the choice of 3' RNA-seq vs standard RNA sequencing to query gene expression levels in a range of biological systems.
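
Pathway analysis of a DEG list is commonly done as an over-representation test. The sketch below shows one such test, an upper-tail hypergeometric p-value, on toy numbers. The abstract does not state which test the authors used, so this is an assumed stand-in; the function name `enrichment_p_value` and its parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(n_genes, n_pathway, n_deg, n_overlap):
    """Upper-tail hypergeometric p-value for pathway over-representation.

    n_genes:   genes in the background
    n_pathway: background genes annotated to the pathway
    n_deg:     differentially expressed genes (DEGs)
    n_overlap: DEGs that are annotated to the pathway
    All names are hypothetical; the abstract does not state which test was used.
    """
    total = comb(n_genes, n_deg)
    upper = min(n_pathway, n_deg)
    # Sum the probabilities of observing n_overlap or more pathway genes.
    tail = sum(
        comb(n_pathway, k) * comb(n_genes - n_pathway, n_deg - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 genes, 5 in the pathway, 5 DEGs, all 5 DEGs in the pathway.
print(enrichment_p_value(10, 5, 5, 5))  # 1 / 252, about 0.00397
```

Because the test depends on where the DEG cutoff falls, a shorter DEG list changes `n_deg` and `n_overlap` directly, whereas rank-based gene set enrichment analysis uses all genes and has no such cutoff.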

Citations: 0