Pub Date: 2024-07-25; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae105
Yushu Shi, Liangliang Zhang, Kim-Anh Do, Robert R Jenq, Christine B Peterson
Summary: Advances in survival analysis have made data modeling far more flexible, yet there remains a lack of tools for illustrating the influence of continuous covariates on predicted survival outcomes. We propose using a colored contour plot to depict predicted survival probabilities over time. Our approach supports conventional models, including the Cox and Fine-Gray models, but it is most useful when coupled with machine learning models such as random survival forests and deep neural networks.
Availability and implementation: We provide a Shiny app at https://biostatistics.mdanderson.org/shinyapps/survivalContour/ and an R package available at https://github.com/YushuShi/survivalContour as implementations of this tool.
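To make the idea concrete, here is a minimal sketch (not the survivalContour implementation) of the grid such a contour plot would color: predicted survival S(t | x) evaluated over time t and one continuous covariate x. It assumes a Cox-type model with a constant baseline hazard; `survival_grid`, `beta`, `baseline_hazard`, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def survival_grid(times, covariate_values, beta, baseline_hazard):
    """Return S(t | x) for every (x, t) pair.

    Assumes a proportional-hazards model with a constant baseline hazard,
    so S(t | x) = exp(-baseline_hazard * t * exp(beta * x)).
    This is an illustrative assumption, not the paper's model.
    """
    grid = []
    for x in covariate_values:
        relative_risk = math.exp(beta * x)  # hazard ratio relative to x = 0
        row = [math.exp(-baseline_hazard * t * relative_risk) for t in times]
        grid.append(row)
    return grid


# Hypothetical grid: six time points and three covariate values.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
covariate_values = [-1.0, 0.0, 1.0]
grid = survival_grid(times, covariate_values, beta=0.5, baseline_hazard=0.1)
```

Each row of `grid` corresponds to one covariate value and each column to one time point. Passing such a grid to a filled-contour routine (for example matplotlib's `contourf`) would give the kind of colored display the summary describes.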
Citation: survivalContour: visualizing predicted survival via colored contour plots. Bioinformatics Advances, 4(1), vbae105. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11290613/pdf/
Pub Date: 2024-07-23; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae107
Saul Pierotti, Bettina Welz, Mireia Osuna-López, Tomas Fitzgerald, Joachim Wittbrodt, Ewan Birney
Motivation: Crosses among inbred lines are a fundamental tool for the discovery of genetic loci associated with phenotypes of interest. In organisms for which large reference panels or SNP chips are not available, imputation from low-pass whole-genome sequencing is an effective method for obtaining genotype data from a large number of individuals. To date, a structured analysis of the conditions required for optimal genotype imputation has not been performed.
Results: We report a systematic exploration of the effect of several design variables on imputation performance in F2 crosses of inbred medaka lines using the imputation software STITCH. We determined that imputation performance reaches a plateau as per-sample sequencing coverage increases, and that the position of this plateau depends on the number of samples. We also systematically explored the trade-offs between cost, imputation accuracy, and sample numbers. We developed a computational pipeline to streamline the process, enabling other researchers to perform a similar cost-benefit analysis on their population of interest.
Availability and implementation: The source code for the pipeline is available at https://github.com/birneylab/stitchimpute. While our pipeline has been developed and tested for an F2 population, the software can also be used to analyse populations with a different structure.
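The cost side of such a trade-off can be sketched as a simple enumeration of candidate designs. The example below is not part of the stitchimpute pipeline: the cost model (a fixed library preparation cost per sample plus a sequencing cost proportional to coverage), the function name `affordable_designs`, and every number are hypothetical. It does not model imputation accuracy, which would have to be measured empirically for each design.

```python
def affordable_designs(budget, prep_cost, cost_per_x, sample_counts, coverages):
    """List (n_samples, coverage, total_cost) designs that fit the budget.

    Hypothetical cost model: each sample costs prep_cost for library
    preparation plus cost_per_x for every 1x of sequencing coverage.
    """
    designs = []
    for n_samples in sample_counts:
        for coverage in coverages:
            total_cost = n_samples * (prep_cost + coverage * cost_per_x)
            if total_cost <= budget:
                designs.append((n_samples, coverage, total_cost))
    return designs


# Hypothetical budget and unit costs, in arbitrary currency units.
designs = affordable_designs(
    budget=10000.0,
    prep_cost=10.0,
    cost_per_x=20.0,
    sample_counts=[100, 200, 400],
    coverages=[0.5, 1.0, 2.0],
)
```

Under these made-up prices, 400 samples are only affordable at 0.5x coverage, which illustrates why sample number and per-sample coverage have to be chosen jointly.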
Citation: Genotype imputation in F2 crosses of inbred lines. Bioinformatics Advances, 4(1), vbae107. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11286293/pdf/
Pub Date: 2024-07-22; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae106
Jehad Aldahdooh, Ziaurrehman Tanoli, Jing Tang
Motivation: Drug-target interactions (DTIs) play a pivotal role in drug discovery, which aims to identify potential drug targets and elucidate their mechanisms of action. In recent years, the application of natural language processing (NLP), particularly when combined with pre-trained language models, has gained considerable momentum in the biomedical domain, with the potential to mine vast amounts of text and so extract DTIs efficiently from the literature.
Results: In this article, we approach DTI extraction as an entity-relationship extraction problem, using different pre-trained transformer language models, such as BERT. Our results indicate that an ensemble approach that combines gene descriptions from the Entrez Gene database with chemical descriptions from the Comparative Toxicogenomics Database (CTD) is critical for achieving optimal performance. The proposed model achieves an F1 score of 80.6 on the hidden DrugProt test set, which is the top-ranked performance among all the submitted models in the official evaluation. Furthermore, we conduct a comparative analysis of gene textual descriptions sourced from the Entrez Gene and UniProt databases to gain insight into their impact on performance. Our findings highlight the potential of NLP-based text mining using gene and chemical descriptions to improve drug-target extraction tasks.
Availability and implementation: Datasets utilized in this study are accessible at https://dtis.drugtargetcommons.org/.
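For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from raw counts; the function name `f1_score` and the counts are hypothetical and are not taken from the DrugProt evaluation.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Compute F1 as the harmonic mean of precision and recall."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0 or true_positives == 0:
        return 0.0  # convention: F1 is 0 when nothing is correctly extracted
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 80 correct relations, 20 spurious, 20 missed.
example_f1 = f1_score(true_positives=80, false_positives=20, false_negatives=20)
```

Scores such as the 80.6 quoted above are reported on a 0 to 100 scale, that is, the value returned here multiplied by 100.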
Citation: Mining drug-target interactions from biomedical literature using chemical and gene descriptions-based ensemble transformer model. Bioinformatics Advances, 4(1), vbae106. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11293871/pdf/
Pub Date: 2024-07-18; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae091
[This corrects the article DOI: 10.1093/bioadv/vbae019.].
Citation: Correction to: Quantitative transcriptomic and epigenomic data analysis: a primer. Bioinformatics Advances, 4(1), vbae091. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11257713/pdf/
Pub Date: 2024-07-17; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae104
Michael Predl, Kilian Gandolf, Michael Hofer, Thomas Rattei
Motivation: Genome-scale community metabolic models are used to gain mechanistic insights into interactions between community members. However, existing tools for visualizing metabolic models cater only to single-organism models.
Results: ScyNet is a Cytoscape app for visualizing community metabolic models. It generates networks with reduced complexity by focusing on interactions between community members. ScyNet can incorporate the state of a metabolic model via fluxes or flux ranges, as demonstrated on a previously published simplified cystic fibrosis airway community model.
Availability and implementation: ScyNet is freely available under an MIT licence and can be retrieved via the Cytoscape App Store (apps.cytoscape.org/apps/scynet). The source code is available on GitHub (github.com/univieCUBE/ScyNet).
Citation: ScyNet: Visualizing interactions in community metabolic models. Bioinformatics Advances, 4(1), vbae104. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11315608/pdf/
Pub Date: 2024-07-17; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae097
Aryo Pradipta Gema, Dominik Grabarczyk, Wolf De Wulf, Piyush Borole, Javier Antonio Alfaro, Pasquale Minervini, Antonio Vergari, Ajitha Rajan
Summary: Knowledge graphs (KGs) are powerful tools for representing and organizing complex biomedical data. They empower researchers, physicians, and scientists by facilitating rapid access to biomedical information, enabling the discernment of patterns or insights, and fostering the formulation of decisions and the generation of novel knowledge. To automate these activities, several KG embedding algorithms have been proposed to learn from and complete KGs. However, the efficacy of these embedding algorithms appears limited when applied to biomedical KGs, prompting questions about whether they can be useful in this field. To that end, we explore several widely used KG embedding models and evaluate their performance and applications using a recent biomedical KG, BioKG. We also demonstrate that by using recent best practices for training KG embeddings, it is possible to improve performance over BioKG. Additionally, we address interpretability concerns that naturally arise with such machine learning methods. In particular, we examine rule-based methods that aim to address these concerns by making interpretable predictions using learned rules, achieving comparable performance. Finally, we discuss a realistic use case in which a pretrained BioKG embedding is further trained for a specific task: here, four polypharmacy scenarios where the goal is to predict missing links or entities in other downstream KGs. We conclude that in the right scenarios, biomedical KG embeddings can be effective and useful.
Availability and implementation: Our code and data are available at https://github.com/aryopg/biokge.
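As a rough illustration of how a KG embedding model completes a graph, the sketch below scores candidate tail entities with TransE, one well-known embedding model. The summary does not say which models were evaluated, so the choice of TransE is an assumption made here; the entity names, relation name, and two-dimensional vectors are hypothetical, and nothing below comes from the biokge code.

```python
import math


def transe_score(head, relation, tail):
    """TransE plausibility: the negative L2 distance ||head + relation - tail||.

    Higher (closer to zero) means the triple is considered more plausible.
    """
    distance = math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )
    return -distance


def rank_tails(head_name, relation_name, entities, relations):
    """Rank every other entity as a candidate tail for (head, relation, ?)."""
    head = entities[head_name]
    relation = relations[relation_name]
    scored = [
        (name, transe_score(head, relation, vector))
        for name, vector in entities.items()
        if name != head_name
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical, hand-picked embeddings (real ones would be learned).
entities = {
    "drug_a": [0.0, 0.0],
    "protein_x": [1.0, 1.0],
    "protein_y": [3.0, 0.0],
}
relations = {"targets": [1.0, 1.0]}
ranking = rank_tails("drug_a", "targets", entities, relations)
```

Link prediction then amounts to reading off the top of `ranking`: with these toy vectors, `protein_x` sits exactly at `drug_a + targets` and is ranked first.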
Citation: Knowledge graph embeddings in the biomedical domain: are they useful? A look at link prediction, rule learning, and downstream polypharmacy tasks. Bioinformatics Advances, 4(1), vbae097. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11538020/pdf/
Pub Date: 2024-07-13; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae103
Chiara Rodella, Symela Lazaridi, Thomas Lemmin
Motivation: Understanding protein thermostability is essential for numerous biotechnological applications, but traditional experimental methods are time-consuming, expensive, and error-prone. Recently, deep learning (DL) techniques from natural language processing (NLP) have been extended to biology, since the primary sequence of a protein can be viewed as a string of amino acids that follows a physicochemical grammar.
Results: In this study, we developed TemBERTure, a DL framework that predicts thermostability class and melting temperature from protein sequences. Our findings emphasize the importance of data diversity for training robust models, especially by including sequences from a wider range of organisms. Additionally, we suggest using attention scores from DL models to gain deeper insights into protein thermostability. Analyzing these scores in conjunction with the 3D protein structure can enhance understanding of the complex interactions among amino acid properties, their positioning, and the surrounding microenvironment. By addressing the limitations of current prediction methods and introducing new avenues for exploration, this research paves the way for more accurate and informative protein thermostability predictions, ultimately accelerating advances in protein engineering.
Availability and implementation: The TemBERTure model and data are available at https://github.com/ibmm-unibe-ch/TemBERTure.
Citation: TemBERTure: advancing protein thermostability prediction with deep learning and attention mechanisms. Bioinformatics Advances, 4(1), vbae103. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11262459/pdf/
Pub Date: 2024-07-11; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae098
Zehua T Zhou, Gregory L Owens, Wesley A Larson, Runyang Nicolas Lou, Peter H Sudmant
Summary: We developed loco-pipe, a Snakemake pipeline that streamlines a set of essential population genomic analyses for low-coverage whole genome sequencing (lcWGS) data. loco-pipe is highly automated, easily customizable, and massively parallelized, making it a valuable tool for both new and experienced users of lcWGS.
Availability and implementation: loco-pipe is published under the GPLv3. It is freely available on GitHub (github.com/sudmantlab/loco-pipe) and archived on Zenodo (doi.org/10.5281/zenodo.10425920).
Citation: loco-pipe: an automated pipeline for population genomics with low-coverage whole-genome sequencing. Bioinformatics Advances, 4(1), vbae098. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11246161/pdf/
Pub Date: 2024-07-04; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae100
Matthias Mattanovich, Viktor Hesselberg-Thomsen, Annette Lien, Dovydas Vaitkus, Victoria Sara Saad, Douglas McCloskey
Motivation: INCA is a powerful tool for metabolic flux analysis; however, importing and exporting data and results can be tedious, which limits the use of INCA in automated workflows.
Results: The INCAWrapper enables the use of INCA purely through Python, which allows the use of INCA in common data science workflows.
Availability and implementation: The INCAWrapper is implemented in Python and can be found at https://github.com/biosustain/incawrapper. It is freely available under an MIT License. To run INCA, the user needs their own MATLAB and INCA licenses. INCA is freely available for noncommercial use at mfa.vueinnovations.com.
Citation: INCAWrapper: a Python wrapper for INCA for seamless data import, -export, and -processing. Bioinformatics Advances, 4(1), vbae100. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11245311/pdf/
Pub Date: 2024-06-28; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae087
Jaewoo Lee, Mehita Achuthan, Lucas Chen, Paulina Carmona-Mora
Summary: A problem spanning many research fields is that processed data and research results are often scattered, which makes data access, analysis, extraction, and team sharing more challenging. We have developed a platform for researchers to easily manage tabular data with features like browsing, bookmarking, and linking to external open knowledge bases. The source code, originally designed for genomics research, is customizable for use by other fields or data, providing a no- to low-cost DIY system for research teams.
Availability and implementation: The source code of our DIY app is available at https://github.com/Carmona-MoraUCD/Human-Genomics-Browser. It can be downloaded and run by anyone with a web browser, Python 3, and Node.js on their machine. The web application is licensed under the MIT license.
Citation: A customizable secure DIY web application for accessing, sharing, and browsing aggregate experimental results and metadata. Bioinformatics Advances, 4(1), vbae087. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11257709/pdf/