Pub Date: 2022-08-26; eCollection Date: 2022-09-01; DOI: 10.1515/jib-2022-0006
Delora Baptista, João Correia, Bruno Pereira, Miguel Rocha
Machine learning (ML) is increasingly being used to guide drug discovery processes. When applying ML approaches to chemical datasets, molecular descriptors and fingerprints are typically used to represent compounds as numerical vectors. However, in recent years, end-to-end deep learning (DL) methods that can learn feature representations directly from line notations or molecular graphs have been proposed as alternatives to using precomputed features. This study set out to investigate which compound representation methods are the most suitable for drug sensitivity prediction in cancer cell lines. Twelve different representations were benchmarked on 5 compound screening datasets, using DeepMol, a new chemoinformatics package developed by our research group, to perform these analyses. The results of this study show that the predictive performance of end-to-end DL models is comparable to, and at times surpasses, that of models trained on molecular fingerprints, even when less training data is available. This study also found that combining several compound representation methods into an ensemble can improve performance. Finally, we show that a post hoc feature attribution method can boost the explainability of the DL models.
{"title":"Evaluating molecular representations in machine learning models for drug response prediction and interpretability.","authors":"Delora Baptista, João Correia, Bruno Pereira, Miguel Rocha","doi":"10.1515/jib-2022-0006","DOIUrl":"https://doi.org/10.1515/jib-2022-0006","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":"19 3","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-08-26","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9521826/pdf/","platform":"Semanticscholar","paperid":"33438674","PMCID":"OA"}
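The contrast between precomputed fingerprints and ensembles of representations can be made concrete with a minimal sketch. It is not DeepMol code and does not reproduce the study's pipeline: `toy_fingerprint` is a hypothetical stand-in that hashes character n-grams of a SMILES string (real fingerprints such as ECFP work on atom environments), and `ensemble_mean` simply averages predictions that are assumed to come from models trained on different representations.

```python
import hashlib


def toy_fingerprint(smiles, n_bits=64, max_n=3):
    """Hash every character n-gram (n = 1..max_n) of a SMILES string into a bit vector.

    Illustrative only: this is NOT a chemically meaningful fingerprint.
    """
    bits = [0] * n_bits
    for n in range(1, max_n + 1):
        for i in range(len(smiles) - n + 1):
            digest = hashlib.md5(smiles[i:i + n].encode("utf-8")).digest()
            bits[int.from_bytes(digest[:4], "big") % n_bits] = 1
    return bits


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two equal-length bit vectors."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 0.0


def ensemble_mean(predictions):
    """Average per-compound predictions; `predictions` holds one list per model."""
    return [sum(column) / len(column) for column in zip(*predictions)]


if __name__ == "__main__":
    ethanol = toy_fingerprint("CCO")
    propanol = toy_fingerprint("CCCO")
    print("Tanimoto(ethanol, propanol) =", tanimoto(ethanol, propanol))
    # Hypothetical outputs of two models trained on different representations.
    print("Ensemble:", ensemble_mean([[1.0, 2.0], [3.0, 4.0]]))
```

Averaging is only one way to combine models; the abstract does not state which ensembling strategy the authors used.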
Bioinformatics applies computer science approaches to the analysis of biological data. It is widely known for its genomics-based analysis approaches that have supported, for example, the 1000 Genomes Project. In addition, bioinformatics relates to many other areas, such as analysis of microscopic images (e.g., organelle localization), molecular modelling (e.g., proteins, biological membranes), and visualization of biological networks (e.g., protein-protein interaction networks, metabolism). Design is a highly interdisciplinary field that incorporates aspects such as aesthetic, economic, functional, philosophical, and/or socio-political considerations into the creative process and is usually determined by context. While visualization plays a critical role in bioinformatics, as reflected in a number of conferences and workshops in the field, design in bioinformatics-related research contexts in particular is not as well studied. With this special issue in conjunction with an international workshop, we aim to bring together bioinformaticians from different fields with designers, design researchers, and medical and scientific illustrators to discuss future challenges in the context of bioinformatics and design.
{"title":"Design X Bioinformatics: a community-driven initiative to connect bioinformatics and design.","authors":"Björn Sommer, Daisuke Inoue, Marc Baaden","doi":"10.1515/jib-2022-0037","DOIUrl":"https://doi.org/10.1515/jib-2022-0037","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":"19 2","pages":""},"PeriodicalIF":1.9,"publicationDate":"2022-07-22","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9377699/pdf/","platform":"Semanticscholar","paperid":"40527885","PMCID":"OA"}
Pub Date: 2022-07-19; eCollection Date: 2022-09-01; DOI: 10.1515/jib-2022-0003
Mohd Izzat Yong, Mohd Saberi Mohamad, Yee Wen Choon, Weng Howe Chan, Hasyiya Karimah Adli, Khairul Nizar Syazwan Wsw, Nooraini Yusoff, Muhammad Akmal Remli
Metabolic engineering has grown in importance in recent years and is now applied extensively, particularly in the production of biomass from microbes. Metabolic network models have been widely used in computational procedures developed to enhance metabolite production and to suggest modifications to organisms. A key issue has been the unrealistic flux distributions produced in prior work based on rational modelling frameworks such as OptKnock and OptGene. To address this problem, a hybrid of the Bees Algorithm and Regulatory On/Off Minimization (BAROOM) is used. With Escherichia coli as the model organism, BAROOM identifies the best set of genes whose removal improves succinate production. The evidence shows that BAROOM outperforms alternative strategies for increasing succinate production in model organisms such as E. coli by selecting the best set of genes to be removed.
{"title":"A hybrid of Bees algorithm and regulatory on/off minimization for optimizing lactate and succinate production.","authors":"Mohd Izzat Yong, Mohd Saberi Mohamad, Yee Wen Choon, Weng Howe Chan, Hasyiya Karimah Adli, Khairul Nizar Syazwan Wsw, Nooraini Yusoff, Muhammad Akmal Remli","doi":"10.1515/jib-2022-0003","DOIUrl":"https://doi.org/10.1515/jib-2022-0003","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":" ","pages":""},"PeriodicalIF":1.9,"publicationDate":"2022-07-19","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9521821/pdf/","platform":"Semanticscholar","paperid":"40518181","PMCID":"OA"}
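The search side of a Bees Algorithm over gene-knockout sets can be sketched as follows. This is a toy illustration under strong assumptions, not the BAROOM implementation: the gene names are hypothetical, and `toy_fitness` replaces the ROOM simulation (which would score each knockout set on a genome-scale metabolic model with a MILP solver) with an arbitrary scoring function that rewards a hidden target set.

```python
import random

GENES = ["g%d" % i for i in range(12)]  # hypothetical gene identifiers


def toy_fitness(knockouts):
    """Stand-in for a ROOM-based production score (assumption, not the real objective)."""
    target = {"g1", "g4", "g7"}
    return len(knockouts & target) - 0.25 * len(knockouts - target)


def random_solution(rng, k=3):
    """A scout bee: a random set of k genes to knock out."""
    return frozenset(rng.sample(GENES, k))


def neighbour(solution, rng):
    """Local search step: swap one knocked-out gene for one that is currently kept."""
    removed = rng.choice(sorted(solution))
    added = rng.choice(sorted(set(GENES) - solution))
    return frozenset((solution - {removed}) | {added})


def bees_algorithm(fitness, n_scouts=10, n_best=3, n_recruits=5, n_iter=40, seed=0):
    rng = random.Random(seed)
    population = [random_solution(rng) for _ in range(n_scouts)]
    for _ in range(n_iter):
        population.sort(key=fitness, reverse=True)
        next_population = []
        # Recruit bees around the best sites; keep the site itself if no neighbour beats it.
        for site in population[:n_best]:
            candidates = [neighbour(site, rng) for _ in range(n_recruits)] + [site]
            next_population.append(max(candidates, key=fitness))
        # The remaining bees scout new random sites.
        next_population += [random_solution(rng) for _ in range(n_scouts - n_best)]
        population = next_population
    return max(population, key=fitness)


if __name__ == "__main__":
    best = bees_algorithm(toy_fitness)
    print(sorted(best), toy_fitness(best))
```

Because each best site is retained unless a recruit improves on it, the best score in the population never decreases between iterations; in a real setting the cost is dominated by the fitness evaluation, not by the search loop.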
Pub Date: 2022-07-12; eCollection Date: 2022-09-01; DOI: 10.1515/jib-2021-0036
Simon Orozco-Arias, Mariana S Candamil-Cortes, Paula A Jaimes, Estiven Valencia-Castrillon, Reinel Tabares-Soto, Gustavo Isaza, Romain Guyot
Transposable elements are mobile sequences that can move and insert themselves into chromosomes, activating under internal or external stimuli and giving the organism the ability to adapt to its environment. Annotating transposable elements in genomic data is currently considered a crucial task for understanding key aspects of organisms such as phenotype variability, species evolution, and genome size, among others. Because of the way they replicate, LTR retrotransposons are the most common transposable elements in plants, accounting in some cases for up to 80% of all DNA. To annotate these elements, a reference library is usually created and curated to eliminate TE fragments and false positives, and the elements are then annotated in the genome using a homology-based method. However, the curation process can take weeks and requires extensive manual work and the execution of multiple time-consuming bioinformatics tools. Here, we propose a machine learning-based approach to perform this process automatically on plant genomes, achieving an F1-score of up to 91.18%. This approach was tested with four plant species, achieving an F1-score of up to 93.6% (Oryza granulata) in only 22.61 s, where bioinformatics methods took approximately 6 h. This acceleration demonstrates that the ML-based approach is efficient and could be used in massive sequencing projects.
{"title":"Automatic curation of LTR retrotransposon libraries from plant genomes through machine learning.","authors":"Simon Orozco-Arias, Mariana S Candamil-Cortes, Paula A Jaimes, Estiven Valencia-Castrillon, Reinel Tabares-Soto, Gustavo Isaza, Romain Guyot","doi":"10.1515/jib-2021-0036","DOIUrl":"https://doi.org/10.1515/jib-2021-0036","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":" ","pages":""},"PeriodicalIF":1.9,"publicationDate":"2022-07-12","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9521825/pdf/","platform":"Semanticscholar","paperid":"40498603","PMCID":"OA"}
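The two quantities reported in this abstract, the F1-score and the runtime gain, follow from simple arithmetic. The sketch below computes an F1-score from confusion counts and the speed-up implied by the reported runtimes. The counts passed to `f1_score` are hypothetical and chosen only for illustration, and the speed-up is approximate because the abstract gives the baseline as "approximately 6 h".

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall, from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def speedup(baseline_seconds, new_seconds):
    """How many times faster the new method is than the baseline."""
    return baseline_seconds / new_seconds


if __name__ == "__main__":
    # Hypothetical counts: 90 true positives, 10 false positives, 10 false negatives.
    print("F1 =", f1_score(tp=90, fp=10, fn=10))
    # Runtimes from the abstract: about 6 h for the conventional tools vs 22.61 s.
    print("Speed-up = %.0fx" % speedup(6 * 3600, 22.61))
```

With the figures in the abstract, the ML-based curation is on the order of a thousand times faster than the conventional workflow.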
Among the many properties of proteins, sugars, nucleic acids, membranes and other cellular components, color is not present. At the same time, we humans have a natural ability to recognize and appreciate colors, and we use them generously, with the aim of both delivering information and pleasing the eye. In this article, I suggest how we can reconcile these two situations, with the contribution of biologists, artists, and computer graphics and perception experts. The concept can be developed in a series of initiatives involving the community, including discussion sessions, technical challenges, experimental studies and outreach activities.
{"title":"Colors in the representation of biological structures.","authors":"Monica Zoppè","doi":"10.1515/jib-2022-0021","DOIUrl":"https://doi.org/10.1515/jib-2022-0021","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":"19 2","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-07-04","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9377705/pdf/","platform":"Semanticscholar","paperid":"40562034","PMCID":"OA"}
Visual representations are commonly used to explore, analyse, and communicate information and knowledge in systems biology and beyond. Such visualisations not only need to be accurate but should also be aesthetically pleasing and informative. Using the example of the Systems Biology Graphical Notation (SBGN) we will investigate design considerations for graphically presenting information from systems biology, in particular regarding the use of glyphs for types of information, the style of graph layout for network representation, and the concept of bricks for visual network creation.
{"title":"Design considerations for representing systems biology information with the Systems Biology Graphical Notation.","authors":"Falk Schreiber, Tobias Czauderna","doi":"10.1515/jib-2022-0024","DOIUrl":"https://doi.org/10.1515/jib-2022-0024","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":"19 2","pages":""},"PeriodicalIF":1.9,"publicationDate":"2022-07-04","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9377698/pdf/","platform":"Semanticscholar","paperid":"40470351","PMCID":"OA"}
Davide Spalvieri, Anne-Marine Mauviel, Matthieu Lambert, Nicolas Férey, Sophie Sacquin-Mora, Matthieu Chavent, Marc Baaden
We discuss how design enriches molecular science, particularly structural biology and bioinformatics. We present two use cases, one in academic practice and the other in design for outreach. The first case targets the representation of ion channels and their dynamic properties. In the second, we document a transition process from a research environment to general-purpose designs. Several testimonials from practitioners are given. By describing the design process of abstracted shapes, exploded views of molecular structures, motion-averaged slices, 360-degree panoramic projections, and experiments with lit sphere shading, we document how designers help make scientific data accessible without betraying its meaning, and how a creative mind adds value over purely data-driven visualizations. A similar conclusion was drawn for public outreach, as we found that comic-book-style drawings are better suited for communicating science to a broad audience.
{"title":"Design - a new way to look at old molecules.","authors":"Davide Spalvieri, Anne-Marine Mauviel, Matthieu Lambert, Nicolas Férey, Sophie Sacquin-Mora, Matthieu Chavent, Marc Baaden","doi":"10.1515/jib-2022-0020","DOIUrl":"https://doi.org/10.1515/jib-2022-0020","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":"19 2","pages":""},"PeriodicalIF":1.9,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9377703/pdf/","platform":"Semanticscholar","paperid":"40563696","PMCID":"OA"}
Data from genomics, proteomics, structural biology and cryo-electron microscopy are integrated into a structural illustration of a cross section through an entire JCVI-syn3.0 minimal cell. The illustration is designed with several goals: to inspire excitement in science, to depict the underlying scientific results accurately, and to be feasible in traditional media. Design choices to achieve these goals include reduction of visual complexity with simplified representations, use of orthographic projection to retain scale relationships, and an approach to color that highlights functional compartments of the cell. Given that this simple cell provides an attractive laboratory for exploring the central processes needed for life, several functional narratives are included in the illustration, including division of the cell and the first depiction of an entire cellular proteome. The illustration lays the foundation for 3D molecular modeling of this cell.
{"title":"Integrative illustration of a JCVI-syn3A minimal cell.","authors":"David S Goodsell","doi":"10.1515/jib-2022-0013","DOIUrl":"https://doi.org/10.1515/jib-2022-0013","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":"19 2","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-06-27","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9377704/pdf/","platform":"Semanticscholar","paperid":"40395611","PMCID":"OA"}
Biomedical illustration and visualization techniques provide a window into complex molecular worlds that are difficult to capture through experimental means alone. Biomedical illustrators frequently employ color to help tell a molecular story, e.g., to identify key molecules in a signaling pathway. Currently, color use for molecules is largely arbitrary and often chosen based on the client, cultural factors, or personal taste. The study of molecular dynamics is relatively young, and some stakeholders argue that color use guidelines would throttle the growth of the field. Instead, content authors have ample creative freedom to choose an aesthetic that, e.g., supports the story they want to tell. However, such creative freedom comes at a price. The color design process is challenging, particularly for those without a background in color theory. The result is a semantically inconsistent color space that reduces the interpretability and effectiveness of molecular visualizations as a whole. Our contribution in this paper is threefold. We first discuss some of the factors that contribute to this array of color palettes. Second, we provide a brief sampling of color palettes used in both industry and research sectors. Lastly, we suggest considerations for developing best practices around color palettes applied to molecular visualization.
{"title":"Considering best practices in color palettes for molecular visualizations.","authors":"Laura Garrison, Stefan Bruckner","doi":"10.1515/jib-2022-0016","DOIUrl":"https://doi.org/10.1515/jib-2022-0016","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":"19 2","pages":""},"PeriodicalIF":1.9,"publicationDate":"2022-06-22","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9377702/pdf/","platform":"Semanticscholar","paperid":"40192929","PMCID":"OA"}
The prediction of adverse drug reactions (ADR) is an important step in the drug discovery and design process. Different drug properties have been employed for ADR prediction, but the predictive capability of drug properties and drug functions used in an integrated manner is yet to be explored. In the present work, a methodology based on a multi-label deep neural network and MLSMOTE is proposed for ADR prediction. The proposed methodology has been applied to SMILES string data of drugs, data on 17 molecular descriptors of drugs, and drug function data, both individually and in an integrated manner. The experimental results show that the combination of SMILES strings and drug functions outperformed the other types of data with regard to ADR prediction capability.
{"title":"Integrative analysis of chemical properties and functions of drugs for adverse drug reaction prediction based on multi-label deep neural network","authors":"Pranab Das, Yogita, V. Pal","doi":"10.1515/jib-2022-0007","DOIUrl":"https://doi.org/10.1515/jib-2022-0007","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":" ","pages":""},"PeriodicalIF":1.9,"publicationDate":"2022-05-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"49603676"}
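MLSMOTE-style oversampling starts by deciding which labels count as minority labels. A minimal sketch of that first step is given below, assuming the imbalance-ratio criterion usually described for MLSMOTE: a label is treated as a minority label when its imbalance ratio (the count of the most frequent label divided by the label's own count) exceeds the mean imbalance ratio. The label matrix `Y` is hypothetical toy data, not data from the study, and the synthetic-sample generation step is not shown.

```python
def imbalance_ratios(Y):
    """Per-label imbalance ratio: count of the most frequent label / count of this label.

    Y is a list of binary label rows (one row per drug, one column per ADR label).
    Labels with no positive example get an infinite ratio.
    """
    n_labels = len(Y[0])
    counts = [sum(row[j] for row in Y) for j in range(n_labels)]
    max_count = max(counts)
    return [max_count / c if c else float("inf") for c in counts]


def minority_labels(Y):
    """Indices of labels whose imbalance ratio exceeds the mean ratio.

    Labels without any positive example are skipped, since there is nothing to oversample.
    """
    ratios = imbalance_ratios(Y)
    finite = [r for r in ratios if r != float("inf")]
    mean_ir = sum(finite) / len(finite)
    return [j for j, r in enumerate(ratios) if r != float("inf") and r > mean_ir]


if __name__ == "__main__":
    # Hypothetical toy data: 6 drugs, 3 ADR labels with positive counts 5, 4 and 1.
    Y = [
        [1, 1, 0],
        [1, 1, 0],
        [1, 1, 0],
        [1, 1, 0],
        [1, 0, 1],
        [0, 0, 0],
    ]
    print(imbalance_ratios(Y))   # [1.0, 1.25, 5.0]
    print(minority_labels(Y))    # [2]
```

Samples carrying a minority label would then be the seeds for generating synthetic neighbours before training the multi-label network.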