Pub Date: 2024-09-23; eCollection Date: 2024-01-01; DOI: 10.3389/fbinf.2024.1441024
Rafał A Bachorz, Damian Nowak, Marcin Ratajewski
The drug design process can be successfully supported using a variety of in silico methods. Some of these are oriented toward molecular property prediction, which is a key step in the early drug discovery stage. Before experimental validation, drug candidates are usually compared with known experimental data. Technically, this can be achieved using machine learning approaches, in which selected experimental data are used to train the predictive models. The proposed Python software is designed for this purpose. It supports the entire workflow of molecular data processing, starting from raw data preparation followed by molecular descriptor creation and machine learning model training. The predictive capabilities of the resulting models were carefully validated internally and externally. These models can be easily applied to new compounds, including within more complex workflows involving generative approaches.
Title: QSPRmodeler - An open source application for molecular predictive analytics. Journal: Frontiers in Bioinformatics, vol. 4, article 1441024. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11464749/pdf/
The application of quantum principles to computing has attracted interest since the 1980s. Today the concept is no longer only theoretical: we have the means to design and execute techniques that use quantum principles to perform calculations. The quantum walk search technique exemplifies the practical application of quantum concepts and their potential to transform information technologies. It promises to be versatile and may be applied to various problems. For example, the coined quantum walk search can identify a marked item in a combinatorial search space such as the quantum hypercube. In the quantum hypercube, the qubit states represent the vertices, and the edges represent transitions between states that differ in exactly one qubit. It offers a novel framework for representing k-mer graphs in the quantum realm. The quantum hypercube thus makes it possible to exploit the parallelism provided by superposition and entanglement when searching for a marked k-mer. However, the analysis of the results shows that the search only sometimes hits the target. Through an examination of the quantum walk search circuit outcomes, an evaluation of which input-target combinations are useful, and an exploration of DNA k-mer search, this paper lays the groundwork for further research to bridge the gap between theoretical conjecture in quantum computing and tangible impact in bioinformatics.
Title: The quantum hypercube as a k-mer graph. Authors: Gustavo Becerra-Gavino, Liliana Ibeth Barbosa-Santillan. Pub Date: 2024-09-12; DOI: 10.3389/fbinf.2024.1401223. Journal: Frontiers in Bioinformatics, vol. 4, article 1401223. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11425167/pdf/
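The hypercube description in the abstract above (vertices are qubit states, edges join states differing in one qubit) can be illustrated classically. The sketch below is not the authors' circuit: it assumes a 2-bit-per-nucleotide encoding (A=00, C=01, G=10, T=11), which is a common convention but may differ from the mapping used in the paper, and all function names are hypothetical.

```python
# Assumed 2-bit encoding of nucleotides; the paper may use a different mapping.
ENCODING = {"A": "00", "C": "01", "G": "10", "T": "11"}
DECODING = {bits: base for base, bits in ENCODING.items()}


def kmer_to_bits(kmer):
    """Encode a k-mer as a bitstring of length 2k (one hypercube vertex)."""
    return "".join(ENCODING[base] for base in kmer)


def bits_to_kmer(bits):
    """Decode a bitstring of even length back into a k-mer."""
    return "".join(DECODING[bits[i:i + 2]] for i in range(0, len(bits), 2))


def hypercube_neighbors(bits):
    """Return the vertices reachable by flipping exactly one bit (qubit)."""
    neighbors = []
    for i, bit in enumerate(bits):
        flipped = "1" if bit == "0" else "0"
        neighbors.append(bits[:i] + flipped + bits[i + 1:])
    return neighbors


def kmer_neighbors(kmer):
    """k-mers adjacent to `kmer` on the hypercube under the assumed encoding."""
    return [bits_to_kmer(n) for n in hypercube_neighbors(kmer_to_bits(kmer))]


if __name__ == "__main__":
    print(kmer_neighbors("AC"))
```

Under this encoding a k-mer of length k sits on a hypercube of dimension 2k, so every vertex has 2k neighbors. Note that adjacency depends on the chosen bit encoding, not on biological similarity between k-mers.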
Pub Date: 2024-09-10; eCollection Date: 2024-01-01; DOI: 10.3389/fbinf.2024.1395981
Austin Swart, Ron Caspi, Suzanne Paley, Peter D Karp
We present a tool for multi-omics data analysis that enables simultaneous visualization of up to four types of omics data on organism-scale metabolic network diagrams. The tool's interactive web-based metabolic charts depict the metabolic reactions, pathways, and metabolites of a single organism as described in a metabolic pathway database for that organism; the charts are constructed using automated graphical layout algorithms. The multi-omics visualization facility paints each individual omics dataset onto a different "visual channel" of the metabolic-network diagram. For example, a transcriptomics dataset might be displayed by coloring the reaction arrows within the metabolic chart, while a companion proteomics dataset is displayed as reaction arrow thicknesses, and a complementary metabolomics dataset is displayed as metabolite node colors. Once the network diagrams are painted with omics data, semantic zooming provides more details within the diagram as the user zooms in. Datasets containing multiple time points can be displayed in an animated fashion. The tool will also graph data values for individual reactions or metabolites designated by the user. The user can interactively adjust the mapping from data value ranges to the displayed colors and thicknesses to provide more informative diagrams.
Title: Visual analysis of multi-omics data. Journal: Frontiers in Bioinformatics, vol. 4, article 1395981. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11420163/pdf/
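The idea of painting each omics dataset onto its own visual channel can be sketched as a pair of mappings from data ranges to display attributes. This is a minimal illustration only, not the tool's implementation: the function names, the blue-to-red color ramp, and the 1 to 8 pixel width range are all assumptions made for the example.

```python
def scale(value, lo, hi, out_lo, out_hi):
    """Linearly map value from [lo, hi] to [out_lo, out_hi], clamping at the ends."""
    if hi == lo:
        return out_lo
    t = (value - lo) / (hi - lo)
    t = max(0.0, min(1.0, t))
    return out_lo + t * (out_hi - out_lo)


def to_color(value, lo, hi):
    """Map a value to a hex color on an assumed blue (low) to red (high) ramp."""
    t = scale(value, lo, hi, 0.0, 1.0)
    red = round(255 * t)
    blue = round(255 * (1.0 - t))
    return f"#{red:02x}00{blue:02x}"


def paint_reaction(transcript_value, protein_value, transcript_range, protein_range):
    """Combine two omics values into two independent visual channels."""
    t_lo, t_hi = transcript_range
    p_lo, p_hi = protein_range
    return {
        # Channel 1: transcriptomics value shown as arrow color.
        "arrow_color": to_color(transcript_value, t_lo, t_hi),
        # Channel 2: proteomics value shown as arrow thickness (assumed 1-8 px).
        "arrow_width": scale(protein_value, p_lo, p_hi, 1.0, 8.0),
    }


if __name__ == "__main__":
    print(paint_reaction(10.0, 5.0, (0.0, 10.0), (0.0, 10.0)))
```

Adjusting the `transcript_range` and `protein_range` arguments corresponds to the interactive remapping of data value ranges that the abstract describes.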
Pub Date: 2024-09-10; eCollection Date: 2024-01-01; DOI: 10.3389/fbinf.2024.1457619
Catriona Miller, Theo Portlock, Denis M Nyaga, Justin M O'Sullivan
Machine learning (ML) has shown great promise in genetics and genomics where large and complex datasets have the potential to provide insight into many aspects of disease risk, pathogenesis of genetic disorders, and prediction of health and wellbeing. However, with this possibility there is a responsibility to exercise caution against biases and inflation of results that can have harmful unintended impacts. Therefore, researchers must understand the metrics used to evaluate ML models which can influence the critical interpretation of results. In this review we provide an overview of ML metrics for clustering, classification, and regression and highlight the advantages and disadvantages of each. We also detail common pitfalls that occur during model evaluation. Finally, we provide examples of how researchers can assess and utilise the results of ML models, specifically from a genomics perspective.
Title: A review of model evaluation metrics for machine learning in genetics and genomics. Journal: Frontiers in Bioinformatics, vol. 4, article 1457619. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11420621/pdf/
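One way to see why the choice of metric matters is to compute several standard classification metrics on an imbalanced dataset. The sketch below is not taken from the review; it uses the textbook definitions of accuracy, precision, recall, F1, and the Matthews correlation coefficient (MCC) for binary labels, with zero-denominator cases set to 0.0 by assumption.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Return common metrics; undefined ratios are reported as 0.0 (a convention)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # 5 cases and 95 controls; a trivial model predicts "control" for everyone.
    y_true = [1] * 5 + [0] * 95
    y_pred = [0] * 100
    print(classification_metrics(y_true, y_pred))
```

In this example the trivial model reaches 95% accuracy while its recall, F1, and MCC are all zero, which is the kind of inflated result the abstract warns about.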
Pub Date: 2024-09-06; eCollection Date: 2024-01-01; DOI: 10.3389/fbinf.2024.1463750
Deepasree K, Subhashree Venugopal
Introduction: Ever since the outbreak of listeriosis and other related illnesses caused by the dreadful pathogen Listeria monocytogenes, the lives of immunocompromised individuals have been at risk.
Objectives and methods: The main goal of this study is to understand the potential of terpenes, a major class of secondary metabolites, to inhibit Internalin A (InlA), one of the disease-causing proteins of the pathogen, via in silico approaches.
Results: The best binding affinity value of -9.5 kcal/mol was observed for Bipinnatin and Epispongiadiol according to the molecular docking studies. The compounds were further subjected to ADMET and biological activity estimation which confirmed their good pharmacokinetic properties and antibacterial activity.
Discussion: A molecular dynamics simulation over a timescale of 100 ns finally revealed Epispongiadiol to be a promising drug-like compound that could pave the way to the treatment of this disease.
Title: Molecular docking and molecular dynamic simulation studies to identify potential terpenes against Internalin A protein of Listeria monocytogenes. Journal: Frontiers in Bioinformatics, vol. 4, article 1463750. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11412924/pdf/
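To give the reported docking score of -9.5 kcal/mol a rough physical scale, it can be converted to an implied dissociation constant with the standard relation ΔG = RT ln(Kd). This is only an order-of-magnitude sketch: it assumes the docking score can be read as a binding free energy at 298.15 K, which docking scoring functions only approximate, and the function name is hypothetical.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 0.0019872


def kd_from_delta_g(delta_g_kcal, temperature_k=298.15):
    """Dissociation constant (mol/L) implied by delta_G = R * T * ln(Kd)."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


if __name__ == "__main__":
    kd = kd_from_delta_g(-9.5)
    print(f"Implied Kd for -9.5 kcal/mol: {kd:.2e} M")
```

Under these assumptions the implied Kd is on the order of 10^-7 M. Each additional -1.36 kcal/mol or so lowers the implied Kd by about a factor of ten at room temperature.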
Phage-immunoprecipitation sequencing (PhIP-Seq) technology is an innovative, high-throughput antibody detection method. It enables comprehensive analysis of individual antibody profiles. This technology shows great potential, particularly in exploring disease mechanisms and immune responses. Currently, PhIP-Seq has been successfully applied in various fields, such as the exploration of biomarkers for autoimmune diseases, vaccine development, and allergen detection. A variety of bioinformatics tools have facilitated the development of this process. However, PhIP-Seq technology still faces many challenges and has room for improvement. Here, we review the methods, applications, and challenges of PhIP-Seq and discuss its future directions in immunological research and clinical applications. With continuous progress and optimization, PhIP-Seq is expected to play an even more important role in future biomedical research, providing new ideas and methods for disease prevention, diagnosis, and treatment.
Title: PhIP-Seq: methods, applications and challenges. Authors: Ziru Huang, Samarappuli Mudiyanselage Savini Gunarathne, Wenwen Liu, Yuwei Zhou, Yuqing Jiang, Shiqi Li, Jian Huang. Pub Date: 2024-09-04; DOI: 10.3389/fbinf.2024.1424202. Journal: Frontiers in Bioinformatics, vol. 4, article 1424202. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11408297/pdf/
Pub Date: 2024-09-02; eCollection Date: 2024-01-01; DOI: 10.3389/fbinf.2024.1349205
David Barrios, Carlos Prieto
Rvisdiff is an R/Bioconductor package that generates an interactive interface for the interpretation of differential expression results. It creates a local web page that enables the exploration of statistical analysis results through the generation of auto-analytical visualizations. Users can explore the differential expression results and the source expression data interactively in the same view. As input, the package supports the results of popular differential expression packages such as DESeq2, edgeR, and limma. As output, the package generates a local HTML page that can be easily viewed in a web browser. Rvisdiff is freely available at https://bioconductor.org/packages/Rvisdiff/.
Title: Rvisdiff: An R package for interactive visualization of differential expression. Journal: Frontiers in Bioinformatics, vol. 4, article 1349205. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11402892/pdf/
Pub Date: 2024-08-28; eCollection Date: 2024-01-01; DOI: 10.3389/fbinf.2024.1419274
Hermenegildo Taboada-Castro, Alfredo José Hernández-Álvarez, Juan Miguel Escorcia-Rodríguez, Julio Augusto Freyre-González, Edgardo Galán-Vásquez, Sergio Encarnación-Guevara
Rhizobium etli CFN42 proteome-transcriptome mixed data of exponential growth and nitrogen-fixing bacteroids, as well as Sinorhizobium meliloti 1021 transcriptome data of growth and nitrogen-fixing bacteroids, were integrated into transcriptional regulatory networks (TRNs). The one-step network construction consisted of a matrix-clustering analysis of the matrices of the gene profile and all matrices of the transcription factors (TFs) of the genome. The networks were constructed with the "prediction of regulatory network" application of the RhizoBindingSites database (http://rhizobindingsites.ccg.unam.mx/). The deduced free-living Rhizobium etli network contained 1,146 genes, including 380 TFs and 12 sigma factors. The bacteroid R. etli CFN42 network contained 884 genes, of which 364 were TFs and 12 were sigma factors. The deduced free-living Sinorhizobium meliloti 1021 network contained 643 genes, of which 259 were TFs and seven were sigma factors, and the bacteroid Sinorhizobium meliloti 1021 network contained 357 genes, of which 210 were TFs and six were sigma factors. In a similarity analysis, these deduced condition-dependent networks and the biological condition-independent E. coli and B. subtilis networks segregate from random Erdős-Rényi networks. The deduced networks showed a low average clustering coefficient. They were not scale-free, showing a gradually diminishing hierarchy of TFs, in contrast to the hierarchical role of the sigma factor rpoD in the E. coli K12 network. For rhizobial networks, the partitioning of the genome into chromosome, chromids, and plasmids, across which essential genes are distributed, and the symbiotic ability that is mostly encoded in plasmids may alter the structure of these deduced condition-dependent networks. This work provides potential TF gene-target relationship data for constructing regulons, which are the basic units of a TRN.
Title: Rhizobium etli CFN42 and Sinorhizobium meliloti 1021 bioinformatic transcriptional regulatory networks from culture and symbiosis. Journal: Frontiers in Bioinformatics, vol. 4, article 1419274. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11387232/pdf/
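The average clustering coefficient mentioned in the abstract above can be computed directly from an edge list. The sketch below is a generic illustration, not the authors' pipeline: it treats TF-target edges as undirected, which is an assumption, and the toy gene names are invented. The edge density is included because, in an Erdős-Rényi random graph with edge probability p, the expected clustering coefficient is approximately p, which gives a simple baseline for comparison.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def average_clustering(edges):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    adjacency = build_adjacency(edges)
    if not adjacency:
        return 0.0
    total = 0.0
    for neighbors in adjacency.values():
        k = len(neighbors)
        if k < 2:
            continue
        # Count pairs of neighbors that are themselves connected.
        links = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adjacency)


def edge_density(edges):
    """Fraction of possible undirected edges present (Erdős-Rényi baseline p)."""
    adjacency = build_adjacency(edges)
    n = len(adjacency)
    if n < 2:
        return 0.0
    m = sum(len(neighbors) for neighbors in adjacency.values()) / 2
    return 2.0 * m / (n * (n - 1))


if __name__ == "__main__":
    # Hypothetical toy network: a triangle of genes plus one pendant target.
    toy = [("tfA", "tfB"), ("tfB", "geneC"), ("tfA", "geneC"), ("geneC", "geneD")]
    print(average_clustering(toy), edge_density(toy))
```

A hub-dominated network, such as one sigma factor regulating many otherwise unconnected genes, yields a clustering coefficient near zero under this definition.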
Pub Date: 2024-08-21; eCollection Date: 2024-01-01; DOI: 10.3389/fbinf.2024.1353807
Stuart G Jantzen, Gaël McGill, Jodie Jenkinson
Molecular visualization is a powerful way to represent the complex structure of molecules and their higher order assemblies, as well as the dynamics of their interactions. Although conventions for depicting static molecular structures and complexes are now well established and guide the viewer's attention to specific aspects of structure and function, little attention and design classification has been devoted to how molecular motion is depicted. As we continue to probe and discover how molecules move - including their internal flexibility, conformational changes and dynamic associations with binding partners and environments - we are faced with difficult design challenges that are relevant to molecular visualizations both for the scientific community and students of cell and molecular biology. To facilitate these design decisions, we have identified twelve molecular animation design principles that are important to consider when creating molecular animations. Many of these principles pertain to misconceptions that students have primarily regarding the agency of molecules, while others are derived from visual treatments frequently observed in molecular animations that may promote misconceptions. For each principle, we have created a pair of molecular animations that exemplify the principle by depicting the same content in the presence and absence of that design approach. Although not intended to be prescriptive, we hope this set of design principles can be used by the scientific, education, and scientific visualization communities to facilitate and improve the pedagogical effectiveness of molecular animation.
Title: Design principles for molecular animation. Journal: Frontiers in Bioinformatics, vol. 4, article 1353807. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11371733/pdf/
Pub Date: 2024-08-16; eCollection Date: 2024-01-01; DOI: 10.3389/fbinf.2024.1358374
Jeremias Schebera, Dirk Zeckzer, Daniel Wiegreffe
Sequence alignments are often used to analyze genomic data. However, such alignments are often only calculated and compared on small sequence intervals for analysis purposes. When comparing longer sequences, these are usually divided into shorter sequence intervals for better alignment results. This usually means that the order context of the original sequence is lost. To prevent this, it is possible to use a graph structure to represent the order of the original sequence on the alignment blocks. The visualization of these graph structures can provide insights into the structural variations of genomes in a semi-global context. In this paper, we propose a new graph drawing framework for representing genome-wide multiple sequence alignment (gMSA) data. We produce a hierarchical graph layout that supports the comparative analysis of genomes. Based on a reference, the differences and similarities of the different genome orders are visualized. In this work, we present a complete graph drawing framework for gMSA graphs together with the respective algorithms for each of the steps. Additionally, we provide a prototype and an example data set for analyzing gMSA graphs. Based on this data set, we demonstrate the functionalities of the framework using two examples.
Title: A layout framework for genome-wide multiple sequence alignment graphs. Journal: Frontiers in Bioinformatics, vol. 4, article 1358374. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362851/pdf/
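The idea of using a graph to keep the order context of alignment blocks can be sketched as follows. This is a simplified illustration and not the data model of the paper: it assumes each genome is given as an ordered list of alignment block identifiers, and the genome names, block names, and function names are all hypothetical.

```python
from collections import defaultdict


def build_order_graph(genome_orders):
    """Map each directed edge (block, next_block) to the genomes that contain it."""
    edges = defaultdict(set)
    for genome, blocks in genome_orders.items():
        # Consecutive blocks in a genome become one directed edge.
        for current_block, next_block in zip(blocks, blocks[1:]):
            edges[(current_block, next_block)].add(genome)
    return dict(edges)


def edges_not_in_reference(graph, reference):
    """Edges that no reference path uses, i.e. order differences to the reference."""
    return sorted(edge for edge, genomes in graph.items() if reference not in genomes)


if __name__ == "__main__":
    # Hypothetical example: genome g1 has blocks B2 and B3 swapped.
    orders = {
        "ref": ["B1", "B2", "B3", "B4"],
        "g1": ["B1", "B3", "B2", "B4"],
    }
    graph = build_order_graph(orders)
    print(edges_not_in_reference(graph, "ref"))
```

Edges shared with the reference indicate conserved block order, while edges absent from the reference mark candidate rearrangements that a layout could highlight.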