
Frontiers in bioinformatics: latest articles

The quantum hypercube as a k-mer graph.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-12 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1401223
Gustavo Becerra-Gavino, Liliana Ibeth Barbosa-Santillan

The application of quantum principles in computing has garnered interest since the 1980s. Today, this concept is not only theoretical, but we have the means to design and execute techniques that leverage the quantum principles to perform calculations. The emergence of the quantum walk search technique exemplifies the practical application of quantum concepts and their potential to revolutionize information technologies. It promises to be versatile and may be applied to various problems. For example, the coined quantum walk search allows for identifying a marked item in a combinatorial search space, such as the quantum hypercube. The quantum hypercube organizes the qubits such that the qubit states represent the vertices and the edges represent the transitions to the states differing by one qubit state. It offers a novel framework to represent k-mer graphs in the quantum realm. Thus, the quantum hypercube facilitates the exploitation of parallelism, which is made possible through superposition and entanglement to search for a marked k-mer. However, as found in the analysis of the results, the search is only sometimes successful in hitting the target. Thus, through a meticulous examination of the quantum walk search circuit outcomes, evaluating what input-target combinations are useful, and a visionary exploration of DNA k-mer search, this paper opens the door to innovative possibilities, laying down the groundwork for further research to bridge the gap between theoretical conjecture in quantum computing and a tangible impact in bioinformatics.
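
The hypercube structure described in the abstract can be sketched classically. The following is a minimal illustration, assuming a 2-bit encoding per nucleotide (A=00, C=01, G=10, T=11); that encoding and the function names are hypothetical choices for this sketch and are not taken from the paper. Each k-mer becomes a bit string of length 2k, which labels a hypercube vertex, and its neighbors are the bit strings that differ in exactly one position.

```python
# Assumed 2-bit nucleotide encoding (an illustrative choice, not from the paper).
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def kmer_to_vertex(kmer: str) -> int:
    """Encode a DNA k-mer as the integer label of a hypercube vertex."""
    return int("".join(BASE_TO_BITS[base] for base in kmer), 2)


def vertex_to_kmer(vertex: int, k: int) -> str:
    """Decode a vertex label back into a k-mer of length k."""
    bits = format(vertex, f"0{2 * k}b")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, 2 * k, 2))


def hypercube_neighbors(vertex: int, n_bits: int) -> list[int]:
    """Return the vertices that differ from `vertex` in exactly one bit."""
    return [vertex ^ (1 << i) for i in range(n_bits)]


k = 2
start = kmer_to_vertex("AC")  # bit string 0001
neighbor_kmers = [vertex_to_kmer(v, k) for v in hypercube_neighbors(start, 2 * k)]
print(start, neighbor_kmers)  # 1 ['AA', 'AT', 'CC', 'GC']
```

Under this encoding a single bit flip changes exactly one base, so every hypercube edge joins two k-mers at Hamming distance one. A quantum walk would traverse the same edges in superposition; this sketch only enumerates them.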

Citations: 0
Visual analysis of multi-omics data.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-10 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1395981
Austin Swart, Ron Caspi, Suzanne Paley, Peter D Karp

We present a tool for multi-omics data analysis that enables simultaneous visualization of up to four types of omics data on organism-scale metabolic network diagrams. The tool's interactive web-based metabolic charts depict the metabolic reactions, pathways, and metabolites of a single organism as described in a metabolic pathway database for that organism; the charts are constructed using automated graphical layout algorithms. The multi-omics visualization facility paints each individual omics dataset onto a different "visual channel" of the metabolic-network diagram. For example, a transcriptomics dataset might be displayed by coloring the reaction arrows within the metabolic chart, while a companion proteomics dataset is displayed as reaction arrow thicknesses, and a complementary metabolomics dataset is displayed as metabolite node colors. Once the network diagrams are painted with omics data, semantic zooming provides more details within the diagram as the user zooms in. Datasets containing multiple time points can be displayed in an animated fashion. The tool will also graph data values for individual reactions or metabolites designated by the user. The user can interactively adjust the mapping from data value ranges to the displayed colors and thicknesses to provide more informative diagrams.
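
The idea of painting each omics dataset onto its own visual channel can be sketched as a mapping from data values to drawing attributes. The following is a minimal illustration, assuming a linear blue-to-red color ramp and a pixel range for arrow thickness; the function names, value ranges, and ramp are hypothetical and do not describe the tool's actual implementation.

```python
def scale(value: float, lo: float, hi: float) -> float:
    """Clamp `value` to [lo, hi] and rescale it to the interval [0, 1]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(max((value - lo) / (hi - lo), 0.0), 1.0)


def to_color(value: float, lo: float, hi: float) -> str:
    """Map a value to a hex color on an assumed blue (low) to red (high) ramp."""
    t = scale(value, lo, hi)
    red = round(255 * t)
    blue = round(255 * (1 - t))
    return f"#{red:02x}00{blue:02x}"


def to_thickness(value: float, lo: float, hi: float,
                 min_px: float = 1.0, max_px: float = 8.0) -> float:
    """Map a value to an arrow thickness in pixels."""
    return min_px + scale(value, lo, hi) * (max_px - min_px)


def style_reaction(transcript_level: float, protein_level: float,
                   ranges: dict) -> dict:
    """Combine two omics values into one reaction-arrow style."""
    return {
        "arrow_color": to_color(transcript_level, *ranges["transcriptomics"]),
        "arrow_thickness": to_thickness(protein_level, *ranges["proteomics"]),
    }


# Hypothetical value ranges; adjusting them changes how values map to the display.
ranges = {"transcriptomics": (0.0, 10.0), "proteomics": (0.0, 10.0)}
print(style_reaction(transcript_level=10.0, protein_level=5.0, ranges=ranges))
```

Keeping the value ranges as explicit parameters mirrors the adjustable mapping mentioned in the abstract: narrowing a range spreads the interesting values over more of the color or thickness scale.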

Citations: 0
A review of model evaluation metrics for machine learning in genetics and genomics.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-10 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1457619
Catriona Miller, Theo Portlock, Denis M Nyaga, Justin M O'Sullivan

Machine learning (ML) has shown great promise in genetics and genomics where large and complex datasets have the potential to provide insight into many aspects of disease risk, pathogenesis of genetic disorders, and prediction of health and wellbeing. However, with this possibility there is a responsibility to exercise caution against biases and inflation of results that can have harmful unintended impacts. Therefore, researchers must understand the metrics used to evaluate ML models which can influence the critical interpretation of results. In this review we provide an overview of ML metrics for clustering, classification, and regression and highlight the advantages and disadvantages of each. We also detail common pitfalls that occur during model evaluation. Finally, we provide examples of how researchers can assess and utilise the results of ML models, specifically from a genomics perspective.
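
One way to see why the choice of metric matters is to score a trivial classifier on imbalanced data. The sketch below is an illustration of that general point and is not an example taken from the review; the dataset (5 cases, 95 controls) and the always-negative predictor are invented for the demonstration.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def balanced_accuracy(tp, tn, fp, fn):
    """Mean of sensitivity and specificity (0 when a class is absent)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0 when the denominator is 0."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


# Invented example: 5 cases, 95 controls, and a model that always predicts "control".
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100
counts = confusion_counts(y_true, y_pred)
print(accuracy(*counts), balanced_accuracy(*counts), mcc(*counts))  # 0.95 0.5 0.0
```

The classifier never detects a case, yet its accuracy is 0.95. Balanced accuracy (0.5) and MCC (0.0) both report that it carries no information about the minority class.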

Citations: 0
Molecular docking and molecular dynamic simulation studies to identify potential terpenes against Internalin A protein of Listeria monocytogenes.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-06 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1463750
Deepasree K, Subhashree Venugopal

Introduction: Ever since the outbreak of listeriosis and other related illnesses caused by the dreadful pathogen Listeria monocytogenes, the lives of immunocompromised individuals have been at risk.

Objectives and methods: The main goal of this study is to understand the potential of terpenes, a major class of secondary metabolites, to inhibit Internalin A (InlA), one of the pathogen's disease-causing proteins, via in silico approaches.

Results: The best binding affinity value of -9.5 kcal/mol was observed for Bipinnatin and Epispongiadiol according to the molecular docking studies. The compounds were further subjected to ADMET and biological activity estimation which confirmed their good pharmacokinetic properties and antibacterial activity.

Discussion: Molecular dynamic simulation for a timescale of 100 ns finally revealed Epispongiadiol to be a promising drug-like compound that could possibly pave the way to the treatment of this disease.
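
To give the reported score a sense of scale, the standard thermodynamic relation ΔG = RT ln(Kd) can be used to turn a binding free energy into a dissociation constant. The sketch below is a rough illustration only: it assumes that the docking score of -9.5 kcal/mol can be read as a binding free energy at 298.15 K, which docking scores only approximate, and this conversion is not part of the study.

```python
from math import exp

GAS_CONSTANT_KCAL = 0.0019872  # kcal / (mol * K)


def dissociation_constant(delta_g_kcal_per_mol: float,
                          temperature_k: float = 298.15) -> float:
    """Kd in mol/L from dG = RT ln(Kd), relative to a 1 M standard state."""
    return exp(delta_g_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k))


# Assumption: the docking score is treated as a binding free energy.
kd = dissociation_constant(-9.5)
print(f"Kd is roughly {kd:.2e} M")
```

Under these assumptions the score corresponds to a dissociation constant on the order of 0.1 micromolar. Because Kd depends exponentially on ΔG, a scoring error of about 1.4 kcal/mol shifts the estimate by roughly a factor of ten.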

Citations: 0
PhIP-Seq: methods, applications and challenges.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-04 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1424202
Ziru Huang, Samarappuli Mudiyanselage Savini Gunarathne, Wenwen Liu, Yuwei Zhou, Yuqing Jiang, Shiqi Li, Jian Huang

Phage-immunoprecipitation sequencing (PhIP-Seq) technology is an innovative, high-throughput antibody detection method. It enables comprehensive analysis of individual antibody profiles. This technology shows great potential, particularly in exploring disease mechanisms and immune responses. Currently, PhIP-Seq has been successfully applied in various fields, such as the exploration of biomarkers for autoimmune diseases, vaccine development, and allergen detection. A variety of bioinformatics tools have facilitated the development of this process. However, PhIP-Seq technology still faces many challenges and has room for improvement. Here, we review the methods, applications, and challenges of PhIP-Seq and discuss its future directions in immunological research and clinical applications. With continuous progress and optimization, PhIP-Seq is expected to play an even more important role in future biomedical research, providing new ideas and methods for disease prevention, diagnosis, and treatment.

Citations: 0
Rvisdiff: An R package for interactive visualization of differential expression.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-02 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1349205
David Barrios, Carlos Prieto

Rvisdiff is an R/Bioconductor package that generates an interactive interface for the interpretation of differential expression results. It creates a local web page that enables the exploration of statistical analysis results through the generation of auto-analytical visualizations. Users can explore the differential expression results and the source expression data interactively in the same view. As input, the package supports the results of popular differential expression packages such as DESeq2, edgeR, and limma. As output, the package generates a local HTML page that can be easily viewed in a web browser. Rvisdiff is freely available at https://bioconductor.org/packages/Rvisdiff/.

Citations: 0
Rhizobium etli CFN42 and Sinorhizobium meliloti 1021 bioinformatic transcriptional regulatory networks from culture and symbiosis.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-08-28 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1419274
Hermenegildo Taboada-Castro, Alfredo José Hernández-Álvarez, Juan Miguel Escorcia-Rodríguez, Julio Augusto Freyre-González, Edgardo Galán-Vásquez, Sergio Encarnación-Guevara

Rhizobium etli CFN42 proteome-transcriptome mixed data of exponential growth and nitrogen-fixing bacteroids, as well as Sinorhizobium meliloti 1021 transcriptome data of growth and nitrogen-fixing bacteroids, were integrated into transcriptional regulatory networks (TRNs). The one-step construction network consisted of a matrix-clustering analysis of matrices of the gene profile and all matrices of the transcription factors (TFs) of their genome. The networks were constructed with the prediction of regulatory network application of the RhizoBindingSites database (http://rhizobindingsites.ccg.unam.mx/). The deduced free-living Rhizobium etli network contained 1,146 genes, including 380 TFs and 12 sigma factors. In addition, the bacteroid R. etli CFN42 network contained 884 genes, where 364 were TFs, and 12 were sigma factors, whereas the deduced free-living Sinorhizobium meliloti 1021 network contained 643 genes, where 259 were TFs and seven were sigma factors, and the bacteroid Sinorhizobium meliloti 1021 network contained 357 genes, where 210 were TFs and six were sigma factors. The similarity of these deduced condition-dependent networks and the biological E. coli and B. subtilis independent condition networks segregates from the random Erdős-Rényi networks. Deduced networks showed a low average clustering coefficient. They were not scale-free, showing a gradually diminishing hierarchy of TFs in contrast to the hierarchy role of the sigma factor rpoD in the E. coli K12 network. For rhizobia networks, partitioning the genome into the chromosome, chromids, and plasmids, where essential genes are distributed, and the symbiotic ability that is mostly coded in plasmids, may alter the structure of these deduced condition-dependent networks. It provides potential TF gene-target relationship data for constructing regulons, which are the basic units of a TRN.
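
The two network quantities mentioned in the abstract, the average clustering coefficient and the Erdős-Rényi random baseline, can be sketched in a few lines. The following is a minimal illustration, assuming the regulatory network is treated as an undirected graph stored as a dictionary of neighbor sets; the study's own tooling and parameters are not given in the abstract, so the function names and the toy graphs here are hypothetical.

```python
import random
from itertools import combinations


def average_clustering(adjacency: dict) -> float:
    """Mean local clustering coefficient of an undirected graph.

    `adjacency` maps each node to the set of its neighbors. Nodes with
    fewer than two neighbors contribute a coefficient of 0.
    """
    if not adjacency:
        return 0.0
    total = 0.0
    for neighbors in adjacency.values():
        degree = len(neighbors)
        if degree < 2:
            continue
        # Count edges that exist between pairs of this node's neighbors.
        links = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
        total += 2 * links / (degree * (degree - 1))
    return total / len(adjacency)


def erdos_renyi(n: int, p: float, seed: int = 0) -> dict:
    """Random graph G(n, p): each possible edge is included with probability p."""
    rng = random.Random(seed)
    adjacency = {node: set() for node in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(average_clustering(triangle), average_clustering(star))  # 1.0 0.0
print(average_clustering(erdos_renyi(200, 0.05)))
```

In a G(n, p) graph the expected clustering coefficient is close to p, so generating a random graph with the same number of nodes and a matching edge density gives a simple reference value for a deduced network.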

Citations: 0
Design principles for molecular animation.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-08-21 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1353807
Stuart G Jantzen, Gaël McGill, Jodie Jenkinson

Molecular visualization is a powerful way to represent the complex structure of molecules and their higher order assemblies, as well as the dynamics of their interactions. Although conventions for depicting static molecular structures and complexes are now well established and guide the viewer's attention to specific aspects of structure and function, little attention and design classification has been devoted to how molecular motion is depicted. As we continue to probe and discover how molecules move - including their internal flexibility, conformational changes and dynamic associations with binding partners and environments - we are faced with difficult design challenges that are relevant to molecular visualizations both for the scientific community and students of cell and molecular biology. To facilitate these design decisions, we have identified twelve molecular animation design principles that are important to consider when creating molecular animations. Many of these principles pertain to misconceptions that students have primarily regarding the agency of molecules, while others are derived from visual treatments frequently observed in molecular animations that may promote misconceptions. For each principle, we have created a pair of molecular animations that exemplify the principle by depicting the same content in the presence and absence of that design approach. Although not intended to be prescriptive, we hope this set of design principles can be used by the scientific, education, and scientific visualization communities to facilitate and improve the pedagogical effectiveness of molecular animation.

Citations: 0
A layout framework for genome-wide multiple sequence alignment graphs.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-08-16 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1358374
Jeremias Schebera, Dirk Zeckzer, Daniel Wiegreffe

Sequence alignments are often used to analyze genomic data. However, such alignments are often only calculated and compared on small sequence intervals for analysis purposes. When comparing longer sequences, these are usually divided into shorter sequence intervals for better alignment results. This usually means that the order context of the original sequence is lost. To prevent this, it is possible to use a graph structure to represent the order of the original sequence on the alignment blocks. The visualization of these graph structures can provide insights into the structural variations of genomes in a semi-global context. In this paper, we propose a new graph drawing framework for representing gMSA data. We produce a hierarchical graph layout that supports the comparative analysis of genomes. Based on a reference, the differences and similarities of the different genome orders are visualized. In this work, we present a complete graph drawing framework for gMSA graphs together with the respective algorithms for each of the steps. Additionally, we provide a prototype and an example data set for analyzing gMSA graphs. Based on this data set, we demonstrate the functionalities of the framework using two examples.
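The idea of keeping each genome's original order as a graph over alignment blocks can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' framework: the block identifiers, the genome names, and the `build_order_graph` helper are all hypothetical, and an edge here simply links two blocks that are consecutive in at least one genome.

```python
from collections import defaultdict


def build_order_graph(genome_orders):
    """Map each directed edge (block_a, block_b) to the genomes that traverse it.

    genome_orders: dict of genome name -> list of alignment block ids,
    listed in the order in which they occur in that genome.
    """
    edges = defaultdict(set)
    for genome, blocks in genome_orders.items():
        # Consecutive blocks in a genome become one directed edge.
        for a, b in zip(blocks, blocks[1:]):
            edges[(a, b)].add(genome)
    return dict(edges)


# Hypothetical input: blocks B3 and B4 are swapped in genome_x.
orders = {
    "reference": ["B1", "B2", "B3", "B4", "B5"],
    "genome_x": ["B1", "B2", "B4", "B3", "B5"],
}
graph = build_order_graph(orders)

# Edges followed by every genome, and edges that the reference does not follow.
shared = {edge for edge, genomes in graph.items() if len(genomes) == len(orders)}
non_reference = {edge for edge, genomes in graph.items() if "reference" not in genomes}
```

In this toy input only the edge from B1 to B2 is shared, and the three edges that exist only in `genome_x` mark where its block order departs from the reference. A layout step would then decide how to draw such edges, which this sketch does not attempt.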

Citations: 0
A hybrid approach for predicting transcription factors.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-07-25 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1425419
Sumeet Patiyal, Palak Tiwari, Mohit Ghai, Aman Dhapola, Anjali Dhall, Gajendra P S Raghava

Transcription factors are essential DNA-binding proteins that regulate the transcription rate of several genes and control the expression of genes inside a cell. The prediction of transcription factors with high precision is important for understanding biological processes such as cell differentiation, intracellular signaling, and cell-cycle control. In this study, we developed a hybrid method that combines alignment-based and alignment-free methods for predicting transcription factors with higher accuracy. All models have been trained, tested, and evaluated on a large dataset that contains 19,406 transcription factors and 523,560 non-transcription factor protein sequences. To avoid biases in evaluation, the datasets were divided into training and validation/independent datasets, where 80% of the data was used for training, and the remaining 20% was used for external validation. In the case of alignment-free methods, models were developed using machine learning techniques and the composition-based features of a protein. Our best alignment-free model obtained an AUC of 0.97 on an independent dataset. In the case of the alignment-based method, we used BLAST at different cut-offs to predict the transcription factors. Although the alignment-based method demonstrated excellent performance, it was unable to cover all transcription factors due to instances of no hits. To combine the strengths of both methods, we developed a hybrid method that combines alignment-free and alignment-based methods. In the hybrid method, we added the scores of the alignment-free and alignment-based methods and achieved a maximum AUC of 0.99 on the independent dataset. The method proposed in this study performs better than existing methods. We incorporated the best models in the webserver/Python Package Index/standalone package of "TransFacPred" (https://webs.iiitd.edu.in/raghava/transfacpred).
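The score-addition step described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the TransFacPred implementation: the function names, the adjustment of 0.5 for a BLAST hit, and the decision threshold of 0.5 are hypothetical values chosen only to show how a sequence with no hit falls back on the alignment-free score.

```python
def hybrid_score(ml_score, blast_hit=None, blast_weight=0.5):
    """Add an alignment-based adjustment to an alignment-free score.

    ml_score: score from the alignment-free (machine learning) model.
    blast_hit: "tf" if the best hit is a known transcription factor,
               "non_tf" if it is a known non-transcription factor,
               None if BLAST returned no hit at the chosen cut-off.
    blast_weight: hypothetical size of the adjustment (an assumption).
    """
    adjustment = {"tf": blast_weight, "non_tf": -blast_weight, None: 0.0}[blast_hit]
    return ml_score + adjustment


def predict(ml_score, blast_hit=None, threshold=0.5):
    """Call a sequence a transcription factor if the hybrid score exceeds the threshold."""
    return hybrid_score(ml_score, blast_hit) > threshold


# A borderline alignment-free score is rescued by a hit to a known transcription factor.
no_hit_call = predict(0.4)
tf_hit_call = predict(0.4, "tf")
non_tf_hit_call = predict(0.6, "non_tf")
```

Because the no-hit case contributes zero, every sequence still receives a score, which is the coverage gap the abstract attributes to the alignment-based method on its own.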

Citations: 0