Frontiers in Bioinformatics: latest articles

A time-calibrated phylogeny of the diversification of Holoadeninae frogs.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-10-02 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1441373
Júlio C M Chaves, Fábio Hepp, Carlos G Schrago, Beatriz Mello

The phylogeny of the major lineages of Amphibia has received significant attention in recent years, although evolutionary relationships within families remain largely neglected. One such overlooked group is the subfamily Holoadeninae, comprising 73 species across nine genera and characterized by a disjunct geographical distribution. The lack of a fossil record for this subfamily hampers the formulation of a comprehensive evolutionary hypothesis for their diversification. Aiming to fill this gap, we inferred the phylogenetic relationships and divergence times for Holoadeninae using molecular data and calibration information derived from the fossil record of Neobatrachia. Our inferred phylogeny confirmed most genus-level associations, and molecular dating analysis placed the origin of Holoadeninae in the Eocene, with subsequent splits also occurring during this period. The climatic and geological events that occurred during the Oligocene-Miocene transition were crucial to the dynamic biogeographical history of the subfamily. However, the wide highest posterior density intervals in our divergence time estimates are primarily attributed to the absence of Holoadeninae fossil information and, secondarily, to the limited number of sampled nucleotide sites.
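The highest posterior density (HPD) intervals mentioned in the abstract can be illustrated with a minimal sketch. It assumes the posterior is available as a plain list of sampled node ages and takes the narrowest window that covers the requested share of samples. The function name `hpd_interval` and the sample ages are hypothetical and are not taken from the study.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval that contains `mass` of the samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    n = len(ordered)
    # Number of consecutive sorted samples the interval must cover.
    k = max(1, math.ceil(mass * n))
    # Slide a window of k sorted samples and keep the narrowest one.
    best_lo, best_hi = ordered[0], ordered[k - 1]
    for i in range(1, n - k + 1):
        lo, hi = ordered[i], ordered[i + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi


# Hypothetical divergence-time samples in millions of years (illustration only).
ages = [38.0, 40.5, 41.0, 41.2, 42.0, 42.3, 43.1, 44.0, 47.5, 55.0]
print(hpd_interval(ages, mass=0.5))
```

A wide interval from such a calculation reflects a diffuse posterior, which is the situation the authors attribute to missing fossil calibrations and a limited number of sites.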

Citations: 0
SciJava Ops: an improved algorithms framework for Fiji and beyond.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-27 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1435733
Gabriel J Selzer, Curtis T Rueden, Mark C Hiner, Edward L Evans, David Kolb, Marcel Wiedenmann, Christian Birkhold, Tim-Oliver Buchholz, Stefan Helfrich, Brian Northan, Alison Walter, Johannes Schindelin, Tobias Pietzsch, Stephan Saalfeld, Michael R Berthold, Kevin W Eliceiri

Decades of iteration on scientific imaging hardware and software have yielded an explosion not only in the size, complexity, and heterogeneity of image datasets but also in the tooling used to analyze these data. This wealth of image analysis tools, spanning different programming languages, frameworks, and data structures, is itself a problem for data analysts who must adapt to new technologies and integrate established routines to solve increasingly complex problems. While many "bridge" layers exist to unify pairs of popular tools, there exists a need for a general solution to unify new and existing toolkits. The SciJava Ops library presented here addresses this need through two novel principles. Algorithm implementations are declared as plugins called Ops, providing a uniform interface regardless of the toolkit they came from. Users express their needs declaratively to the Op environment, which can then find and adapt available Ops on demand. By using these principles instead of direct function calls, users can write streamlined workflows while avoiding the translation boilerplate of bridge layers. Developers can easily extend SciJava Ops to introduce new libraries and more efficient, specialized algorithm implementations, even immediately benefitting existing workflows. We provide several use cases showing both user and developer benefits, as well as benchmarking data to quantify the negligible impact on overall analysis performance. We have initially deployed SciJava Ops on the Fiji platform; however, it would be suitable for integration with additional analysis platforms in the future.

Citations: 0
Pangenome comparison via ED strings.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-26 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1397036
Esteban Gabory, Moses Njagi Mwaniki, Nadia Pisanti, Solon P Pissis, Jakub Radoszewski, Michelle Sweering, Wiktor Zuba

Introduction: An elastic-degenerate (ED) string is a sequence of sets of strings. It can also be seen as a directed acyclic graph whose edges are labeled by strings. The notion of ED strings was introduced as a simple alternative to variation and sequence graphs for representing a pangenome, that is, a collection of genomic sequences to be analyzed jointly or to be used as a reference.

Methods: In this study, we define notions of matching statistics of two ED strings as similarity measures between pangenomes and consequently infer a corresponding distance measure. We then show that both measures can be computed efficiently, in both theory and practice, by employing the intersection graph of two ED strings.

Results: We also implemented our methods as a software tool for pangenome comparison and evaluated their efficiency and effectiveness using both synthetic and real datasets.

Discussion: As for efficiency, we compare the runtime of the intersection graph method against the classic product automaton construction, showing that the intersection graph is faster by up to one order of magnitude. To show effectiveness, we used real SARS-CoV-2 datasets and our matching statistics similarity measure to reproduce a well-established clade classification of SARS-CoV-2, thus demonstrating that the classification obtained by our method is in accordance with the existing one.
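The definition of an ED string as a sequence of sets of strings can be made concrete with a small sketch. It represents an ED string as a Python list of sets and enumerates every string it spells by brute force. This is only an illustration of the data structure: it is exponential in the worst case and is not the intersection graph method the authors describe. The names `ed_language` and `share_a_string` and the toy sequences are hypothetical.

```python
from itertools import product


def ed_language(ed_string):
    """All strings spelled by choosing one string from each set, in order."""
    return {"".join(choice) for choice in product(*ed_string)}


def share_a_string(ed_a, ed_b):
    """Brute-force check whether two ED strings spell a common string."""
    return not ed_language(ed_a).isdisjoint(ed_language(ed_b))


# Toy pangenomes; the empty string "" stands for a deletion.
ed_a = [{"AC"}, {"G", "T", ""}, {"TA"}]
ed_b = [{"A"}, {"CG", "CC"}, {"TA", "GA"}]

print(sorted(ed_language(ed_a)))
print(share_a_string(ed_a, ed_b))
```

Both toy ED strings spell `ACGTA` even though their sets are segmented differently, which is why a comparison has to reason about paths through both structures rather than compare sets position by position.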

Citations: 0
QSPRmodeler - An open source application for molecular predictive analytics.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-23 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1441024
Rafał A Bachorz, Damian Nowak, Marcin Ratajewski

The drug design process can be successfully supported using a variety of in silico methods. Some of these are oriented toward molecular property prediction, which is a key step in the early drug discovery stage. Before experimental validation, drug candidates are usually compared with known experimental data. Technically, this can be achieved using machine learning approaches, in which selected experimental data are used to train the predictive models. The proposed Python software is designed for this purpose. It supports the entire workflow of molecular data processing, starting from raw data preparation followed by molecular descriptor creation and machine learning model training. The predictive capabilities of the resulting models were carefully validated internally and externally. These models can be easily applied to new compounds, including within more complex workflows involving generative approaches.

Citations: 0
The quantum hypercube as a k-mer graph.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-12 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1401223
Gustavo Becerra-Gavino, Liliana Ibeth Barbosa-Santillan

The application of quantum principles in computing has garnered interest since the 1980s. Today, this concept is no longer only theoretical: we have the means to design and execute techniques that leverage quantum principles to perform calculations. The emergence of the quantum walk search technique exemplifies the practical application of quantum concepts and their potential to revolutionize information technologies. It promises to be versatile and may be applied to various problems. For example, the coined quantum walk search allows for identifying a marked item in a combinatorial search space, such as the quantum hypercube. The quantum hypercube organizes the qubits such that the qubit states represent the vertices and the edges represent the transitions to the states differing by one qubit state. It offers a novel framework to represent k-mer graphs in the quantum realm. Thus, the quantum hypercube facilitates the exploitation of parallelism, which is made possible through superposition and entanglement to search for a marked k-mer. However, as found in the analysis of the results, the search is only sometimes successful in hitting the target. Thus, through a meticulous examination of the quantum walk search circuit outcomes, evaluating what input-target combinations are useful, and a visionary exploration of DNA k-mer search, this paper opens the door to innovative possibilities, laying down the groundwork for further research to bridge the gap between theoretical conjecture in quantum computing and a tangible impact in bioinformatics.
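The hypercube structure described above, where each vertex is a bit string and edges join states that differ in one bit, can be sketched classically. The two-bits-per-nucleotide encoding below is an assumption made for illustration; the abstract does not state which encoding the authors use, and the function names are hypothetical. No quantum library is involved.

```python
# Assumed encoding (not taken from the paper): two bits per nucleotide.
BASE_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_BASE = {bits: base for base, bits in BASE_BITS.items()}


def hypercube_neighbors(state, n_qubits):
    """Basis states reachable by flipping exactly one of n_qubits bits."""
    return [state ^ (1 << bit) for bit in range(n_qubits)]


def encode_kmer(kmer):
    """Pack a k-mer into an integer, first base in the highest bits."""
    value = 0
    for base in kmer:
        value = (value << 2) | BASE_BITS[base]
    return value


def decode_kmer(value, k):
    """Inverse of encode_kmer for a k-mer of length k."""
    bases = []
    for shift in range(2 * (k - 1), -1, -2):
        bases.append(BITS_BASE[(value >> shift) & 0b11])
    return "".join(bases)


kmer = "AC"
state = encode_kmer(kmer)
neighbors = [
    decode_kmer(s, len(kmer))
    for s in hypercube_neighbors(state, 2 * len(kmer))
]
print(neighbors)
```

Under this assumed encoding a k-mer has 2k hypercube neighbors, fewer than its 3k single-base substitutions, because some substitutions (such as C to G) flip two bits at once.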

Citations: 0
Visual analysis of multi-omics data.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-10 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1395981
Austin Swart, Ron Caspi, Suzanne Paley, Peter D Karp

We present a tool for multi-omics data analysis that enables simultaneous visualization of up to four types of omics data on organism-scale metabolic network diagrams. The tool's interactive web-based metabolic charts depict the metabolic reactions, pathways, and metabolites of a single organism as described in a metabolic pathway database for that organism; the charts are constructed using automated graphical layout algorithms. The multi-omics visualization facility paints each individual omics dataset onto a different "visual channel" of the metabolic-network diagram. For example, a transcriptomics dataset might be displayed by coloring the reaction arrows within the metabolic chart, while a companion proteomics dataset is displayed as reaction arrow thicknesses, and a complementary metabolomics dataset is displayed as metabolite node colors. Once the network diagrams are painted with omics data, semantic zooming provides more details within the diagram as the user zooms in. Datasets containing multiple time points can be displayed in an animated fashion. The tool will also graph data values for individual reactions or metabolites designated by the user. The user can interactively adjust the mapping from data value ranges to the displayed colors and thicknesses to provide more informative diagrams.
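The idea of painting different datasets onto different visual channels can be sketched as a pair of mappings from a data range to a color and to a line thickness. This is a generic linear mapping written for illustration, not the tool's implementation; the function names, the blue-to-red ramp, the pixel limits, and the example values are all assumptions.

```python
def rescale(value, lo, hi):
    """Map value from [lo, hi] to [0, 1], clamping out-of-range values."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def to_color(value, lo, hi, start=(0, 0, 255), end=(255, 0, 0)):
    """Linear RGB ramp from `start` (at lo) to `end` (at hi), as a hex string."""
    t = rescale(value, lo, hi)
    r, g, b = (round(s + t * (e - s)) for s, e in zip(start, end))
    return f"#{r:02x}{g:02x}{b:02x}"


def to_thickness(value, lo, hi, min_px=1.0, max_px=6.0):
    """Linear mapping from a data value to a line width in pixels."""
    return min_px + rescale(value, lo, hi) * (max_px - min_px)


# Hypothetical values for one reaction: a transcript log-ratio and a protein level.
style = {
    "arrow_color": to_color(1.5, lo=-2.0, hi=2.0),
    "arrow_thickness": to_thickness(30.0, lo=0.0, hi=100.0),
}
print(style)
```

Changing `lo` and `hi` corresponds to the interactive adjustment of the value-range mapping that the abstract mentions: narrowing the range spreads the colors over the values that matter and clamps the rest.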

Citations: 0
A review of model evaluation metrics for machine learning in genetics and genomics.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-10 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1457619
Catriona Miller, Theo Portlock, Denis M Nyaga, Justin M O'Sullivan

Machine learning (ML) has shown great promise in genetics and genomics, where large and complex datasets have the potential to provide insight into many aspects of disease risk, pathogenesis of genetic disorders, and prediction of health and wellbeing. However, with this possibility comes a responsibility to exercise caution against biases and inflation of results that can have harmful unintended impacts. Therefore, researchers must understand the metrics used to evaluate ML models, which can influence the critical interpretation of results. In this review, we provide an overview of ML metrics for clustering, classification, and regression and highlight the advantages and disadvantages of each. We also detail common pitfalls that occur during model evaluation. Finally, we provide examples of how researchers can assess and utilise the results of ML models, specifically from a genomics perspective.
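One way results can look inflated is through accuracy on an imbalanced dataset. The sketch below computes several standard classification metrics from a confusion matrix so they can be compared side by side. The counts are hypothetical (10 cases among 1,000 samples) and the example is an illustration, not one taken from the review.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical classifier: finds 2 of 10 cases and raises 3 false alarms
# among 990 controls.
metrics = classification_metrics(tp=2, fp=3, fn=8, tn=987)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts the accuracy is close to 0.99 although only one in five cases is detected, which is the kind of gap that recall, F1, and the Matthews correlation coefficient make visible.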

Citations: 0
Molecular docking and molecular dynamic simulation studies to identify potential terpenes against Internalin A protein of Listeria monocytogenes.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-06 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1463750
Deepasree K, Subhashree Venugopal

Introduction: Ever since the outbreak of listeriosis and other related illnesses caused by the dreadful pathogen Listeria monocytogenes, the lives of immunocompromised individuals have been at risk.

Objectives and methods: The main goal of this study is to assess the potential of terpenes, a major class of secondary metabolites, to inhibit Internalin A (InlA), one of the pathogen's disease-causing proteins, via in silico approaches.

Results: The best binding affinity value of -9.5 kcal/mol was observed for Bipinnatin and Epispongiadiol according to the molecular docking studies. The compounds were further subjected to ADMET and biological activity estimation, which confirmed their good pharmacokinetic properties and antibacterial activity.

Discussion: Molecular dynamic simulation for a timescale of 100 ns finally revealed Epispongiadiol to be a promising drug-like compound that could possibly pave the way to the treatment of this disease.
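To give a sense of scale for the reported -9.5 kcal/mol, the sketch below converts a binding free energy into the dissociation constant it would imply through the standard relation ΔG = RT ln(Kd). It assumes the docking score can be read as a binding free energy at 298.15 K with a 1 M standard state, which is a rough approximation because docking scores are estimates; the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g_kcal, temperature_k=298.15):
    """Kd in mol/L implied by a binding free energy, via dG = RT ln(Kd)."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


kd = dissociation_constant(-9.5)
print(f"{kd:.2e} M")
```

Under these assumptions the score corresponds to a dissociation constant on the order of 100 nM; it should be read as an order-of-magnitude guide rather than a measured affinity.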

Citations: 0
PhIP-Seq: methods, applications and challenges.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-04 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1424202
Ziru Huang, Samarappuli Mudiyanselage Savini Gunarathne, Wenwen Liu, Yuwei Zhou, Yuqing Jiang, Shiqi Li, Jian Huang

Phage-immunoprecipitation sequencing (PhIP-Seq) technology is an innovative, high-throughput antibody detection method. It enables comprehensive analysis of individual antibody profiles. This technology shows great potential, particularly in exploring disease mechanisms and immune responses. Currently, PhIP-Seq has been successfully applied in various fields, such as the exploration of biomarkers for autoimmune diseases, vaccine development, and allergen detection. A variety of bioinformatics tools have facilitated the development of this process. However, PhIP-Seq technology still faces many challenges and has room for improvement. Here, we review the methods, applications, and challenges of PhIP-Seq and discuss its future directions in immunological research and clinical applications. With continuous progress and optimization, PhIP-Seq is expected to play an even more important role in future biomedical research, providing new ideas and methods for disease prevention, diagnosis, and treatment.

Citations: 0
Rvisdiff: An R package for interactive visualization of differential expression.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-02 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1349205
David Barrios, Carlos Prieto

Rvisdiff is an R/Bioconductor package that generates an interactive interface for the interpretation of differential expression results. It creates a local web page that enables the exploration of statistical analysis results through the generation of auto-analytical visualizations. Users can explore the differential expression results and the source expression data interactively in the same view. As input, the package supports the results of popular differential expression packages such as DESeq2, edgeR, and limma. As output, the package generates a local HTML page that can be easily viewed in a web browser. Rvisdiff is freely available at https://bioconductor.org/packages/Rvisdiff/.
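The general idea in the abstract above, turning a table of differential expression results into a local HTML page that opens in a browser, can be sketched in a few lines. This is only an illustration in Python using the standard library; it is not Rvisdiff's implementation or API, and the gene names, values, the `render_report` and `write_report` helpers, and the 0.05 cutoff are all assumptions made for the example.

```python
import html
from pathlib import Path

# Hypothetical differential expression results:
# (gene, log2 fold change, adjusted p-value). Values are made up.
RESULTS = [
    ("GENE_A", 2.31, 0.0004),
    ("GENE_B", -1.87, 0.0120),
    ("GENE_C", 0.12, 0.8300),
]


def render_report(results, padj_cutoff=0.05):
    """Return a self-contained HTML page listing results, most significant first."""
    rows = []
    for gene, log2fc, padj in sorted(results, key=lambda r: r[2]):
        # Mark rows below the (assumed) adjusted p-value cutoff as significant.
        css = "sig" if padj < padj_cutoff else "ns"
        rows.append(
            f'<tr class="{css}"><td>{html.escape(gene)}</td>'
            f"<td>{log2fc:.2f}</td><td>{padj:.4f}</td></tr>"
        )
    return (
        "<!DOCTYPE html><html><head><meta charset='utf-8'>"
        "<title>Differential expression report</title>"
        "<style>.sig{font-weight:bold}.ns{color:#888}</style></head><body>"
        "<table><tr><th>Gene</th><th>log2FC</th><th>padj</th></tr>"
        + "".join(rows)
        + "</table></body></html>"
    )


def write_report(results, path="de_report.html"):
    """Write the report to a local file and return its path."""
    out = Path(path)
    out.write_text(render_report(results), encoding="utf-8")
    return out
```

Calling `write_report(RESULTS)` would produce a single `de_report.html` file with no server or external assets, which is the property the abstract highlights for the package's output. The real package adds interactive plots and links to the source expression data, which this sketch does not attempt.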

Citations: 0