Pub Date : 2023-08-03eCollection Date: 2023-01-01DOI: 10.3389/fbinf.2023.1225807
Jose Barba-Montoya, Sudip Sharma, Sudhir Kumar
A common practice in molecular systematics is to infer phylogeny and then scale it to time by using a relaxed clock method and calibrations. This sequential analysis practice ignores the effect of phylogenetic uncertainty on divergence time estimates and their confidence/credibility intervals. An alternative is to infer phylogeny and times jointly to incorporate phylogenetic errors into molecular dating. We compared the performance of these two alternatives in reconstructing evolutionary timetrees using computer-simulated and empirical datasets. We found sequential and joint analyses to produce similar divergence times and phylogenetic relationships, except for some nodes in particular cases. The joint inference performed better when the phylogeny was not well resolved, situations in which the joint inference should be preferred. However, joint inference can be infeasible for large datasets because available Bayesian methods are computationally burdensome. We present an alternative approach for joint inference that combines the bag of little bootstraps, maximum likelihood, and RelTime approaches for simultaneously inferring evolutionary relationships, divergence times, and confidence intervals, incorporating phylogeny uncertainty. The new method alleviates the high computational burden imposed by Bayesian methods while achieving a similar result.
{"title":"Molecular timetrees using relaxed clocks and uncertain phylogenies.","authors":"Jose Barba-Montoya, Sudip Sharma, Sudhir Kumar","doi":"10.3389/fbinf.2023.1225807","DOIUrl":"10.3389/fbinf.2023.1225807","url":null,"abstract":"<p><p>A common practice in molecular systematics is to infer phylogeny and then scale it to time by using a relaxed clock method and calibrations. This sequential analysis practice ignores the effect of phylogenetic uncertainty on divergence time estimates and their confidence/credibility intervals. An alternative is to infer phylogeny and times jointly to incorporate phylogenetic errors into molecular dating. We compared the performance of these two alternatives in reconstructing evolutionary timetrees using computer-simulated and empirical datasets. We found sequential and joint analyses to produce similar divergence times and phylogenetic relationships, except for some nodes in particular cases. The joint inference performed better when the phylogeny was not well resolved, situations in which the joint inference should be preferred. However, joint inference can be infeasible for large datasets because available Bayesian methods are computationally burdensome. We present an alternative approach for joint inference that combines the bag of little bootstraps, maximum likelihood, and RelTime approaches for simultaneously inferring evolutionary relationships, divergence times, and confidence intervals, incorporating phylogeny uncertainty. The new method alleviates the high computational burden imposed by Bayesian methods while achieving a similar result.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"3 ","pages":"1225807"},"PeriodicalIF":0.0,"publicationDate":"2023-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10435864/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10046632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-08-02eCollection Date: 2023-01-01DOI: 10.3389/fbinf.2023.1164482
Zihao Wang, Yun Zhou, Yu Zhang, Yu K Mo, Yijie Wang
Introduction: Existing large-scale preclinical cancer drug response databases provide us with a great opportunity to identify and predict potentially effective drugs to combat cancers. Deep learning models built on these databases have been developed and applied to tackle the cancer drug-response prediction task. Their prediction has been demonstrated to significantly outperform traditional machine learning methods. However, due to the "black box" characteristic, biologically faithful explanations are hardly derived from these deep learning models. Interpretable deep learning models that rely on visible neural networks (VNNs) have been proposed to provide biological justification for the predicted outcomes. However, their performance does not meet the expectation to be applied in clinical practice. Methods: In this paper, we develop an XMR model, an eXplainable Multimodal neural network for drug Response prediction. XMR is a new compact multimodal neural network consisting of two sub-networks: a visible neural network for learning genomic features and a graph neural network (GNN) for learning drugs' structural features. Both sub-networks are integrated into a multimodal fusion layer to model the drug response for the given gene mutations and the drug's molecular structures. Furthermore, a pruning approach is applied to provide better interpretations of the XMR model. We use five pathway hierarchies (cell cycle, DNA repair, diseases, signal transduction, and metabolism), which are obtained from the Reactome Pathway Database, as the architecture of VNN for our XMR model to predict drug responses of triple negative breast cancer. Results: We find that our model outperforms other state-of-the-art interpretable deep learning models in terms of predictive performance. In addition, our model can provide biological insights into explaining drug responses for triple-negative breast cancer. Discussion: Overall, combining both VNN and GNN in a multimodal fusion layer, XMR captures key genomic and molecular features and offers reasonable interpretability in biology, thereby better predicting drug responses in cancer patients. Our model would also benefit personalized cancer therapy in the future.
{"title":"XMR: an explainable multimodal neural network for drug response prediction.","authors":"Zihao Wang, Yun Zhou, Yu Zhang, Yu K Mo, Yijie Wang","doi":"10.3389/fbinf.2023.1164482","DOIUrl":"10.3389/fbinf.2023.1164482","url":null,"abstract":"<p><p><b>Introduction:</b> Existing large-scale preclinical cancer drug response databases provide us with a great opportunity to identify and predict potentially effective drugs to combat cancers. Deep learning models built on these databases have been developed and applied to tackle the cancer drug-response prediction task. Their prediction has been demonstrated to significantly outperform traditional machine learning methods. However, due to the \"black box\" characteristic, biologically faithful explanations are hardly derived from these deep learning models. Interpretable deep learning models that rely on visible neural networks (VNNs) have been proposed to provide biological justification for the predicted outcomes. However, their performance does not meet the expectation to be applied in clinical practice. <b>Methods:</b> In this paper, we develop an XMR model, an eXplainable Multimodal neural network for drug Response prediction. XMR is a new compact multimodal neural network consisting of two sub-networks: a visible neural network for learning genomic features and a graph neural network (GNN) for learning drugs' structural features. Both sub-networks are integrated into a multimodal fusion layer to model the drug response for the given gene mutations and the drug's molecular structures. Furthermore, a pruning approach is applied to provide better interpretations of the XMR model. We use five pathway hierarchies (cell cycle, DNA repair, diseases, signal transduction, and metabolism), which are obtained from the Reactome Pathway Database, as the architecture of VNN for our XMR model to predict drug responses of triple negative breast cancer. <b>Results:</b> We find that our model outperforms other state-of-the-art interpretable deep learning models in terms of predictive performance. In addition, our model can provide biological insights into explaining drug responses for triple-negative breast cancer. <b>Discussion:</b> Overall, combining both VNN and GNN in a multimodal fusion layer, XMR captures key genomic and molecular features and offers reasonable interpretability in biology, thereby better predicting drug responses in cancer patients. Our model would also benefit personalized cancer therapy in the future.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"3 ","pages":"1164482"},"PeriodicalIF":0.0,"publicationDate":"2023-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10433751/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10039829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-07-27eCollection Date: 2023-01-01DOI: 10.3389/fbinf.2023.1234218
Ryan S McClure, Yvonne Rericha, Katrina M Waters, Robyn L Tanguay
Introduction: The application of RNA-sequencing has led to numerous breakthroughs related to investigating gene expression levels in complex biological systems. Among these are knowledge of how organisms, such as the vertebrate model organism zebrafish (Danio rerio), respond to toxicant exposure. Recently, the development of 3' RNA-seq has allowed for the determination of gene expression levels with a fraction of the required reads compared to standard RNA-seq. While 3' RNA-seq has many advantages, a comparison to standard RNA-seq has not been performed in the context of whole organism toxicity and sparse data. Methods and results: Here, we examined samples from zebrafish exposed to perfluorobutane sulfonamide (FBSA) with either 3' or standard RNA-seq to determine the advantages of each with regards to the identification of functionally enriched pathways. We found that 3' and standard RNA-seq showed specific advantages when focusing on annotated or unannotated regions of the genome. We also found that standard RNA-seq identified more differentially expressed genes (DEGs), but that this advantage disappeared under conditions of sparse data. We also found that standard RNA-seq had a significant advantage in identifying functionally enriched pathways via analysis of DEG lists but that this advantage was minimal when identifying pathways via gene set enrichment analysis of all genes. Conclusions: These results show that each approach has experimental conditions where they may be advantageous. Our observations can help guide others in the choice of 3' RNA-seq vs standard RNA sequencing to query gene expression levels in a range of biological systems.
{"title":"3' RNA-seq is superior to standard RNA-seq in cases of sparse data but inferior at identifying toxicity pathways in a model organism.","authors":"Ryan S McClure, Yvonne Rericha, Katrina M Waters, Robyn L Tanguay","doi":"10.3389/fbinf.2023.1234218","DOIUrl":"10.3389/fbinf.2023.1234218","url":null,"abstract":"<p><p><b>Introduction:</b> The application of RNA-sequencing has led to numerous breakthroughs related to investigating gene expression levels in complex biological systems. Among these are knowledge of how organisms, such as the vertebrate model organism zebrafish (<i>Danio rerio</i>), respond to toxicant exposure. Recently, the development of 3' RNA-seq has allowed for the determination of gene expression levels with a fraction of the required reads compared to standard RNA-seq. While 3' RNA-seq has many advantages, a comparison to standard RNA-seq has not been performed in the context of whole organism toxicity and sparse data. <b>Methods and results:</b> Here, we examined samples from zebrafish exposed to perfluorobutane sulfonamide (FBSA) with either 3' or standard RNA-seq to determine the advantages of each with regards to the identification of functionally enriched pathways. We found that 3' and standard RNA-seq showed specific advantages when focusing on annotated or unannotated regions of the genome. We also found that standard RNA-seq identified more differentially expressed genes (DEGs), but that this advantage disappeared under conditions of sparse data. We also found that standard RNA-seq had a significant advantage in identifying functionally enriched pathways via analysis of DEG lists but that this advantage was minimal when identifying pathways via gene set enrichment analysis of all genes. <b>Conclusions:</b> These results show that each approach has experimental conditions where they may be advantageous. Our observations can help guide others in the choice of 3' RNA-seq vs standard RNA sequencing to query gene expression levels in a range of biological systems.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"3 ","pages":"1234218"},"PeriodicalIF":2.8,"publicationDate":"2023-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10414111/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9990456","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-07-26eCollection Date: 2023-01-01DOI: 10.3389/fbinf.2023.1243663
Michael John Fanous, Nir Pillar, Aydogan Ozcan
Traditional staining of biological specimens for microscopic imaging entails time-consuming, laborious, and costly procedures, in addition to producing inconsistent labeling and causing irreversible sample damage. In recent years, computational "virtual" staining using deep learning techniques has evolved into a robust and comprehensive application for streamlining the staining process without typical histochemical staining-related drawbacks. Such virtual staining techniques can also be combined with neural networks designed to correct various microscopy aberrations, such as out-of-focus or motion blur artifacts, and improve upon diffracted-limited resolution. Here, we highlight how such methods lead to a host of new opportunities that can significantly improve both sample preparation and imaging in biomedical microscopy.
{"title":"Digital staining facilitates biomedical microscopy.","authors":"Michael John Fanous, Nir Pillar, Aydogan Ozcan","doi":"10.3389/fbinf.2023.1243663","DOIUrl":"10.3389/fbinf.2023.1243663","url":null,"abstract":"<p><p>Traditional staining of biological specimens for microscopic imaging entails time-consuming, laborious, and costly procedures, in addition to producing inconsistent labeling and causing irreversible sample damage. In recent years, computational \"virtual\" staining using deep learning techniques has evolved into a robust and comprehensive application for streamlining the staining process without typical histochemical staining-related drawbacks. Such virtual staining techniques can also be combined with neural networks designed to correct various microscopy aberrations, such as out-of-focus or motion blur artifacts, and improve upon diffracted-limited resolution. Here, we highlight how such methods lead to a host of new opportunities that can significantly improve both sample preparation and imaging in biomedical microscopy.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"3 ","pages":"1243663"},"PeriodicalIF":2.8,"publicationDate":"2023-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10411189/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9969679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-07-26eCollection Date: 2023-01-01DOI: 10.3389/fbinf.2023.1159381
Mohammed Zidane, Ahmad Makky, Matthias Bruhns, Alexander Rochwarger, Sepideh Babaei, Manfred Claassen, Christian M Schürch
<p><p>Since its introduction into the field of oncology, deep learning (DL) has impacted clinical discoveries and biomarker predictions. DL-driven discoveries and predictions in oncology are based on a variety of biological data such as genomics, proteomics, and imaging data. DL-based computational frameworks can predict genetic variant effects on gene expression, as well as protein structures based on amino acid sequences. Furthermore, DL algorithms can capture valuable mechanistic biological information from several spatial "omics" technologies, such as spatial transcriptomics and spatial proteomics. Here, we review the impact that the combination of artificial intelligence (AI) with spatial omics technologies has had on oncology, focusing on DL and its applications in biomedical image analysis, encompassing cell segmentation, cell phenotype identification, cancer prognostication, and therapy prediction. We highlight the advantages of using highly multiplexed images (spatial proteomics data) compared to single-stained, conventional histopathological ("simple") images, as the former can provide deep mechanistic insights that cannot be obtained by the latter, even with the aid of explainable AI. Furthermore, we provide the reader with the advantages/disadvantages of DL-based pipelines used in preprocessing highly multiplexed images (cell segmentation, cell type annotation). Therefore, this review also guides the reader to choose the DL-based pipeline that best fits their data. In conclusion, DL continues to be established as an essential tool in discovering novel biological mechanisms when combined with technologies such as highly multiplexed tissue imaging data. In balance with conventional medical data, its role in clinical routine will become more important, supporting diagnosis and prognosis in oncology, enhancing clinical decision-making, and improving the quality of care for patients. Since its introduction into the field of oncology, deep learning (DL) has impacted clinical discoveries and biomarker predictions. DL-driven discoveries and predictions in oncology are based on a variety of biological data such as genomics, proteomics, and imaging data. DL-based computational frameworks can predict genetic variant effects on gene expression, as well as protein structures based on amino acid sequences. Furthermore, DL algorithms can capture valuable mechanistic biological information from several spatial "omics" technologies, such as spatial transcriptomics and spatial proteomics. Here, we review the impact that the combination of artificial intelligence (AI) with spatial omics technologies has had on oncology, focusing on DL and its applications in biomedical image analysis, encompassing cell segmentation, cell phenotype identification, cancer prognostication, and therapy prediction. We highlight the advantages of using highly multiplexed images (spatial proteomics data) compared to single-stained, conventional histopathological ("simple") ima
{"title":"A review on deep learning applications in highly multiplexed tissue imaging data analysis.","authors":"Mohammed Zidane, Ahmad Makky, Matthias Bruhns, Alexander Rochwarger, Sepideh Babaei, Manfred Claassen, Christian M Schürch","doi":"10.3389/fbinf.2023.1159381","DOIUrl":"10.3389/fbinf.2023.1159381","url":null,"abstract":"<p><p>Since its introduction into the field of oncology, deep learning (DL) has impacted clinical discoveries and biomarker predictions. DL-driven discoveries and predictions in oncology are based on a variety of biological data such as genomics, proteomics, and imaging data. DL-based computational frameworks can predict genetic variant effects on gene expression, as well as protein structures based on amino acid sequences. Furthermore, DL algorithms can capture valuable mechanistic biological information from several spatial \"omics\" technologies, such as spatial transcriptomics and spatial proteomics. Here, we review the impact that the combination of artificial intelligence (AI) with spatial omics technologies has had on oncology, focusing on DL and its applications in biomedical image analysis, encompassing cell segmentation, cell phenotype identification, cancer prognostication, and therapy prediction. We highlight the advantages of using highly multiplexed images (spatial proteomics data) compared to single-stained, conventional histopathological (\"simple\") images, as the former can provide deep mechanistic insights that cannot be obtained by the latter, even with the aid of explainable AI. Furthermore, we provide the reader with the advantages/disadvantages of DL-based pipelines used in preprocessing highly multiplexed images (cell segmentation, cell type annotation). Therefore, this review also guides the reader to choose the DL-based pipeline that best fits their data. In conclusion, DL continues to be established as an essential tool in discovering novel biological mechanisms when combined with technologies such as highly multiplexed tissue imaging data. In balance with conventional medical data, its role in clinical routine will become more important, supporting diagnosis and prognosis in oncology, enhancing clinical decision-making, and improving the quality of care for patients. Since its introduction into the field of oncology, deep learning (DL) has impacted clinical discoveries and biomarker predictions. DL-driven discoveries and predictions in oncology are based on a variety of biological data such as genomics, proteomics, and imaging data. DL-based computational frameworks can predict genetic variant effects on gene expression, as well as protein structures based on amino acid sequences. Furthermore, DL algorithms can capture valuable mechanistic biological information from several spatial \"omics\" technologies, such as spatial transcriptomics and spatial proteomics. Here, we review the impact that the combination of artificial intelligence (AI) with spatial omics technologies has had on oncology, focusing on DL and its applications in biomedical image analysis, encompassing cell segmentation, cell phenotype identification, cancer prognostication, and therapy prediction. We highlight the advantages of using highly multiplexed images (spatial proteomics data) compared to single-stained, conventional histopathological (\"simple\") ima","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"3 ","pages":"1159381"},"PeriodicalIF":2.8,"publicationDate":"2023-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10410935/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9978648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-07-25eCollection Date: 2023-01-01DOI: 10.3389/fbinf.2023.1233748
Wei Ouyang, Kevin W Eliceiri, Beth A Cimini
As biological imaging continues to rapidly advance, it results in increasingly complex image data, necessitating a reevaluation of conventional bioimage analysis methods and their accessibility. This perspective underscores our belief that a transition from desktop-based tools to web-based bioimage analysis could unlock immense opportunities for improved accessibility, enhanced collaboration, and streamlined workflows. We outline the potential benefits, such as reduced local computational demands and solutions to common challenges, including software installation issues and limited reproducibility. Furthermore, we explore the present state of web-based tools, hurdles in implementation, and the significance of collective involvement from the scientific community in driving this transition. In acknowledging the potential roadblocks and complexity of data management, we suggest a combined approach of selective prototyping and large-scale workflow application for optimal usage. Embracing web-based bioimage analysis could pave the way for the life sciences community to accelerate biological research, offering a robust platform for a more collaborative, efficient, and democratized science.
{"title":"Moving beyond the desktop: prospects for practical bioimage analysis via the web.","authors":"Wei Ouyang, Kevin W Eliceiri, Beth A Cimini","doi":"10.3389/fbinf.2023.1233748","DOIUrl":"10.3389/fbinf.2023.1233748","url":null,"abstract":"<p><p>As biological imaging continues to rapidly advance, it results in increasingly complex image data, necessitating a reevaluation of conventional bioimage analysis methods and their accessibility. This perspective underscores our belief that a transition from desktop-based tools to web-based bioimage analysis could unlock immense opportunities for improved accessibility, enhanced collaboration, and streamlined workflows. We outline the potential benefits, such as reduced local computational demands and solutions to common challenges, including software installation issues and limited reproducibility. Furthermore, we explore the present state of web-based tools, hurdles in implementation, and the significance of collective involvement from the scientific community in driving this transition. In acknowledging the potential roadblocks and complexity of data management, we suggest a combined approach of selective prototyping and large-scale workflow application for optimal usage. Embracing web-based bioimage analysis could pave the way for the life sciences community to accelerate biological research, offering a robust platform for a more collaborative, efficient, and democratized science.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"3 ","pages":"1233748"},"PeriodicalIF":2.8,"publicationDate":"2023-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10409478/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10005434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-07-13eCollection Date: 2023-01-01DOI: 10.3389/fbinf.2023.1216362
Fabiano C Fernandes, Marlon H Cardoso, Abel Gil-Ley, Lívia V Luchi, Maria G L da Silva, Maria L R Macedo, Cesar de la Fuente-Nunez, Octavio L Franco
Antimicrobial peptides (AMPs) are components of natural immunity against invading pathogens. They are polymers that fold into a variety of three-dimensional structures, enabling their function, with an underlying sequence that is best represented in a non-flat space. The structural data of AMPs exhibits non-Euclidean characteristics, which means that certain properties, e.g., differential manifolds, common system of coordinates, vector space structure, or translation-equivariance, along with basic operations like convolution, in non-Euclidean space are not distinctly established. Geometric deep learning (GDL) refers to a category of machine learning methods that utilize deep neural models to process and analyze data in non-Euclidean settings, such as graphs and manifolds. This emerging field seeks to expand the use of structured models to these domains. This review provides a detailed summary of the latest developments in designing and predicting AMPs utilizing GDL techniques and also discusses both current research gaps and future directions in the field.
{"title":"Geometric deep learning as a potential tool for antimicrobial peptide prediction.","authors":"Fabiano C Fernandes, Marlon H Cardoso, Abel Gil-Ley, Lívia V Luchi, Maria G L da Silva, Maria L R Macedo, Cesar de la Fuente-Nunez, Octavio L Franco","doi":"10.3389/fbinf.2023.1216362","DOIUrl":"10.3389/fbinf.2023.1216362","url":null,"abstract":"<p><p>Antimicrobial peptides (AMPs) are components of natural immunity against invading pathogens. They are polymers that fold into a variety of three-dimensional structures, enabling their function, with an underlying sequence that is best represented in a non-flat space. The structural data of AMPs exhibits non-Euclidean characteristics, which means that certain properties, e.g., differential manifolds, common system of coordinates, vector space structure, or translation-equivariance, along with basic operations like convolution, in non-Euclidean space are not distinctly established. Geometric deep learning (GDL) refers to a category of machine learning methods that utilize deep neural models to process and analyze data in non-Euclidean settings, such as graphs and manifolds. This emerging field seeks to expand the use of structured models to these domains. This review provides a detailed summary of the latest developments in designing and predicting AMPs utilizing GDL techniques and also discusses both current research gaps and future directions in the field.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"3 ","pages":"1216362"},"PeriodicalIF":2.8,"publicationDate":"2023-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10374423/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9922026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-07-05eCollection Date: 2023-01-01DOI: 10.3389/fbinf.2023.1194993
Adriano Lucieri, Andreas Dengel, Sheraz Ahmed
Artificial Intelligence (AI) has achieved remarkable success in image generation, image analysis, and language modeling, making data-driven techniques increasingly relevant in practical real-world applications, promising enhanced creativity and efficiency for human users. However, the deployment of AI in high-stakes domains such as infrastructure and healthcare still raises concerns regarding algorithm accountability and safety. The emerging field of explainable AI (XAI) has made significant strides in developing interfaces that enable humans to comprehend the decisions made by data-driven models. Among these approaches, concept-based explainability stands out due to its ability to align explanations with high-level concepts familiar to users. Nonetheless, early research in adversarial machine learning has unveiled that exposing model explanations can render victim models more susceptible to attacks. This is the first study to investigate and compare the impact of concept-based explanations on the privacy of Deep Learning based AI models in the context of biomedical image analysis. An extensive privacy benchmark is conducted on three different state-of-the-art model architectures (ResNet50, NFNet, ConvNeXt) trained on two biomedical (ISIC and EyePACS) and one synthetic dataset (SCDB). The success of membership inference attacks while exposing varying degrees of attribution-based and concept-based explanations is systematically compared. The findings indicate that, in theory, concept-based explanations can potentially increase the vulnerability of a private AI system by up to 16% compared to attributions in the baseline setting. However, it is demonstrated that, in more realistic attack scenarios, the threat posed by explanations is negligible in practice. Furthermore, actionable recommendations are provided to ensure the safe deployment of concept-based XAI systems. In addition, the impact of differential privacy (DP) on the quality of concept-based explanations is explored, revealing that while negatively influencing the explanation ability, DP can have an adverse effect on the models' privacy.
{"title":"Translating theory into practice: assessing the privacy implications of concept-based explanations for biomedical AI.","authors":"Adriano Lucieri, Andreas Dengel, Sheraz Ahmed","doi":"10.3389/fbinf.2023.1194993","DOIUrl":"10.3389/fbinf.2023.1194993","url":null,"abstract":"<p><p>Artificial Intelligence (AI) has achieved remarkable success in image generation, image analysis, and language modeling, making data-driven techniques increasingly relevant in practical real-world applications, promising enhanced creativity and efficiency for human users. However, the deployment of AI in high-stakes domains such as infrastructure and healthcare still raises concerns regarding algorithm accountability and safety. The emerging field of explainable AI (XAI) has made significant strides in developing interfaces that enable humans to comprehend the decisions made by data-driven models. Among these approaches, concept-based explainability stands out due to its ability to align explanations with high-level concepts familiar to users. Nonetheless, early research in adversarial machine learning has unveiled that exposing model explanations can render victim models more susceptible to attacks. This is the first study to investigate and compare the impact of concept-based explanations on the privacy of Deep Learning based AI models in the context of biomedical image analysis. An extensive privacy benchmark is conducted on three different state-of-the-art model architectures (ResNet50, NFNet, ConvNeXt) trained on two biomedical (ISIC and EyePACS) and one synthetic dataset (SCDB). The success of membership inference attacks while exposing varying degrees of attribution-based and concept-based explanations is systematically compared. The findings indicate that, in theory, concept-based explanations can potentially increase the vulnerability of a private AI system by up to 16% compared to attributions in the baseline setting. However, it is demonstrated that, in more realistic attack scenarios, the threat posed by explanations is negligible in practice. Furthermore, actionable recommendations are provided to ensure the safe deployment of concept-based XAI systems. In addition, the impact of differential privacy (DP) on the quality of concept-based explanations is explored, revealing that while negatively influencing the explanation ability, DP can have an adverse effect on the models' privacy.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"3 ","pages":"1194993"},"PeriodicalIF":2.8,"publicationDate":"2023-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10356902/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9918469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-06-29eCollection Date: 2023-01-01DOI: 10.3389/fbinf.2023.1210157
Linghao Hu, Blanche Ter Hofstede, Dhavan Sharma, Feng Zhao, Alex J Walsh
Introduction: Autofluorescence imaging of the coenzymes reduced nicotinamide (phosphate) dinucleotide (NAD(P)H) and oxidized flavin adenine dinucleotide (FAD) provides a label-free method to detect cellular metabolism and phenotypes. Time-domain fluorescence lifetime data can be analyzed by exponential decay fitting to extract fluorescence lifetimes or by a fit-free phasor transformation to compute phasor coordinates. Methods: Here, fluorescence lifetime data analysis by biexponential decay curve fitting is compared with phasor coordinate analysis as input data to machine learning models to predict cell phenotypes. Glycolysis and oxidative phosphorylation of MCF7 breast cancer cells were chemically inhibited with 2-deoxy-d-glucose and sodium cyanide, respectively; and fluorescence lifetime images of NAD(P)H and FAD were obtained using a multiphoton microscope. Results: Machine learning algorithms built from either the extracted lifetime values or phasor coordinates predict MCF7 metabolism with a high accuracy (∼88%). Similarly, fluorescence lifetime images of M0, M1, and M2 macrophages were acquired and analyzed by decay fitting and phasor analysis. Machine learning models trained with features from curve fitting discriminate different macrophage phenotypes with improved performance over models trained using only phasor coordinates. Discussion: Altogether, the results demonstrate that both curve fitting and phasor analysis of autofluorescence lifetime images can be used in machine learning models for classification of cell phenotype from the lifetime data.
{"title":"Comparison of phasor analysis and biexponential decay curve fitting of autofluorescence lifetime imaging data for machine learning prediction of cellular phenotypes.","authors":"Linghao Hu, Blanche Ter Hofstede, Dhavan Sharma, Feng Zhao, Alex J Walsh","doi":"10.3389/fbinf.2023.1210157","DOIUrl":"10.3389/fbinf.2023.1210157","url":null,"abstract":"<p><p><b>Introduction:</b> Autofluorescence imaging of the coenzymes reduced nicotinamide (phosphate) dinucleotide (NAD(P)H) and oxidized flavin adenine dinucleotide (FAD) provides a label-free method to detect cellular metabolism and phenotypes. Time-domain fluorescence lifetime data can be analyzed by exponential decay fitting to extract fluorescence lifetimes or by a fit-free phasor transformation to compute phasor coordinates. <b>Methods:</b> Here, fluorescence lifetime data analysis by biexponential decay curve fitting is compared with phasor coordinate analysis as input data to machine learning models to predict cell phenotypes. Glycolysis and oxidative phosphorylation of MCF7 breast cancer cells were chemically inhibited with 2-deoxy-d-glucose and sodium cyanide, respectively; and fluorescence lifetime images of NAD(P)H and FAD were obtained using a multiphoton microscope. <b>Results:</b> Machine learning algorithms built from either the extracted lifetime values or phasor coordinates predict MCF7 metabolism with a high accuracy (∼88%). Similarly, fluorescence lifetime images of M0, M1, and M2 macrophages were acquired and analyzed by decay fitting and phasor analysis. Machine learning models trained with features from curve fitting discriminate different macrophage phenotypes with improved performance over models trained using only phasor coordinates. <b>Discussion:</b> Altogether, the results demonstrate that both curve fitting and phasor analysis of autofluorescence lifetime images can be used in machine learning models for classification of cell phenotype from the lifetime data.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"3 ","pages":"1210157"},"PeriodicalIF":0.0,"publicationDate":"2023-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10342207/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9822065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
RNA accessibility is a useful RNA secondary structural feature for predicting RNA-RNA interactions and translation efficiency in prokaryotes. However, conventional accessibility calculation tools, such as Raccess, are computationally expensive and require considerable computational time to perform transcriptome-scale analyses. In this study, we developed DeepRaccess, which predicts RNA accessibility based on deep learning methods. DeepRaccess was trained to take artificial RNA sequences as input and to predict the accessibility of these sequences as calculated by Raccess. Simulation and empirical dataset analyses showed that the accessibility predicted by DeepRaccess was highly correlated with the accessibility calculated by Raccess. In addition, we confirmed that DeepRaccess can predict protein abundance in E.coli with moderate accuracy from the sequences around the start codon. We also demonstrated that DeepRaccess achieved tens to hundreds of times software speed-up in a GPU environment. The source codes and the trained models of DeepRaccess are freely available at https://github.com/hmdlab/DeepRaccess.
{"title":"DeepRaccess: high-speed RNA accessibility prediction using deep learning","authors":"Kaisei Hara, Natsuki Iwano, Tsukasa Fukunaga, Michiaki Hamada","doi":"10.1101/2023.05.25.542237","DOIUrl":"https://doi.org/10.1101/2023.05.25.542237","url":null,"abstract":"RNA accessibility is a useful RNA secondary structural feature for predicting RNA-RNA interactions and translation efficiency in prokaryotes. However, conventional accessibility calculation tools, such as Raccess, are computationally expensive and require considerable computational time to perform transcriptome-scale analyses. In this study, we developed DeepRaccess, which predicts RNA accessibility based on deep learning methods. DeepRaccess was trained to take artificial RNA sequences as input and to predict the accessibility of these sequences as calculated by Raccess. Simulation and empirical dataset analyses showed that the accessibility predicted by DeepRaccess was highly correlated with the accessibility calculated by Raccess. In addition, we confirmed that DeepRaccess can predict protein abundance in E.coli with moderate accuracy from the sequences around the start codon. We also demonstrated that DeepRaccess achieved tens to hundreds of times software speed-up in a GPU environment. The source codes and the trained models of DeepRaccess are freely available at https://github.com/hmdlab/DeepRaccess.","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"35 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84758482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}