Tim Cramer, B. Kosmynin, Simon Moll, Manoel Römmer, E. Focht, Matthias S. Müller
{"title":"Evaluating the Performance of OpenMP Offloading on the NEC SX-Aurora TSUBASA Vector Engine","authors":"Tim Cramer, B. Kosmynin, Simon Moll, Manoel Römmer, E. Focht, Matthias S. Müller","doi":"10.14529/jsfi210204","DOIUrl":"https://doi.org/10.14529/jsfi210204","url":null,"abstract":"","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. Innov.","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115227541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The fog computing paradigm has become prominent in stream processing for IoT systems, where cloud computing struggles with high latency. It enables the deployment of computational resources between the edge and cloud layers and helps to resolve constraints that arise primarily from the need to react in real time to state changes, improve the locality of data storage, and overcome the limitations of external communication channels. There is an urgent need for tools and platforms to model, implement, manage, and monitor complex fog computing workflows. Traditional scientific workflow management systems (SWMSs) provide modularity and flexibility to design, execute, and monitor complex computational workflows used in smart industry applications. However, they are mainly focused on batch execution of jobs consisting of tightly coupled tasks, and integrating data streams into the SWMSs of IoT systems is challenging. We propose a micro-workflow model that redesigns the monolithic architecture of workflow systems into a set of smaller, independent workflows that support stream processing. A micro-workflow is an independent data stream processing service that can be deployed on different layers of the fog computing environment. To validate the feasibility and practicability of the micro-workflow refactoring, we provide an intensive experimental analysis that evaluates the interval between sensor messages, the time required to create a message, and the time between sending a sensor message and receiving it in the SWMS, including data serialization, network latency, etc. We show that the proposed decoupling, which supports independent implementation, execution, development, maintenance, and cross-platform deployment, with each micro-workflow becoming a standalone computational unit, is a suitable mechanism for IoT stream processing.
{"title":"Micro-Workflows Data Stream Processing Model for Industrial Internet of Things","authors":"Ameer B. A. Alaasam, G. Radchenko, A. Tchernykh","doi":"10.14529/jsfi210106","DOIUrl":"https://doi.org/10.14529/jsfi210106","url":null,"abstract":"The fog computing paradigm has become prominent in stream processing for IoT systems where cloud computing struggles from high latency challenges. It enables the deployment of computational resources between the edge and cloud layers and helps to resolve constraints, primarily due to the need to react in real-time to state changes, improve the locality of data storage, and overcome external communication channels’ limitations. There is an urgent need for tools and platforms to model, implement, manage, and monitor complex fog computing workflows. Traditional scientific workflow management systems (SWMSs) provide modularity and flexibility to design, execute, and monitor complex computational workflows used in smart industry applications. However, they are mainly focused on batch execution of jobs consisting of tightly coupled tasks. Integrating data streams into SWMSs of IoT systems is challenging. We proposed a microworkflow model to redesign the monolith architecture of workflow systems into a set of smaller and independent workflows that support stream processing. Micro-workflow is an independent data stream processing service that can be deployed on different layers of the fog computing environment. To validate the feasibility and practicability of the micro-workflow refactoring, we provide intensive experimental analysis evaluating the interval between sensor messages, the time interval required to create a message, between sending sensor message and receiving the message in SWMS, including data serialization, network latency, etc. 
We show that the proposed decoupling support of the independence of implementation, execution, development, maintenance, and cross-platform deployment, where each micro-workflow becomes a standalone computational unit, is a suitable mechanism for IoT stream processing.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. Innov.","volume":"12 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126196261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper describes the first attempt in the world to develop an architecture-independent graph framework, named VGL, designed for different modern architectures with high-bandwidth memory. Currently, VGL supports two classes of architectures: NEC SX-Aurora TSUBASA vector processors and NVIDIA GPUs. However, VGL can be easily extended to other architectures due to its flexible software structure. VGL is designed to let users select the most suitable architecture for solving a specific graph problem on given input data, which, in turn, makes it possible to significantly outperform existing frameworks and libraries developed for modern multicore CPUs and NVIDIA GPUs. Since VGL uses an identical set of computational and data abstractions for all architectures, its users can easily port graph algorithms between different target architectures without any source code modifications. Additionally, in this paper we show how graph algorithms should be implemented and optimised for NVIDIA GPU and NEC SX-Aurora TSUBASA architectures, demonstrating that both architectures share multiple similar properties and hardware features.
{"title":"Developing an Architecture-independent Graph Framework for Modern Vector Processors and NVIDIA GPUs","authors":"I. Afanasyev","doi":"10.14529/jsfi200404","DOIUrl":"https://doi.org/10.14529/jsfi200404","url":null,"abstract":"This paper describes the first-in-the-world attempt to develop an architectural-independent graph framework named VGL, designed for different modern architectures with high-bandwidth memory. Currently VGL supports two classes of architectures: NEC SX-Aurora TSUBASA vector processors and NVIDIA GPUs. However, VGL can be easily extended to other architectures due to its flexible software structure. VGL is designed to provide users with the possibility of selecting the most suitable architecture for solving a specific graph problem on a given input data, which, in return, allows to significantly outperform existing frameworks and libraries, developed for modern multicore CPUs and NVIDIA GPUs. Since VGL uses an identical set of computational and data abstractions for all architectures, its users can easily port graph algorithms between different target architectures without any source code modifications. Additionally, in this paper we show how graph algorithms should be implemented and optimised for NVIDIA GPU and NEC SX-Aurora TSUBASA architectures, demonstrating that both architectures have multiple similar properties and hardware features.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. Innov.","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127098119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper continues the work initiated by the authors on the feasibility of using ParaView as visualization software for analysing the performance of parallel Computational Fluid Dynamics (CFD) codes. Current performance tools have limited capacity to display their data on top of three-dimensional, framed (i.e., time-stepped) representations of the cluster’s topology. In our first paper, a plugin for the open-source performance tool Score-P was introduced, which intercepts an arbitrary number of manually selected code regions (mostly functions) and sends their respective measurements (number of executions and cumulative time spent) to ParaView through its in situ library, Catalyst, as if they were any other flow-related variable. Our second paper added to this plugin the capacity to also map communication data (messages exchanged between MPI ranks) to the simulation’s geometry. So far, the tool was limited to codes which already have the in situ adapter; in this paper, we take the performance data and display it, also in codes without in situ, on a three-dimensional representation of the hardware resources being used by the simulation. Testing is done with the Multi-Grid and Block Tri-diagonal NPBs, as well as Rolls-Royce’s CFD code, Hydra. The benefits and overhead of the plugin's new functionalities are discussed.
{"title":"Enhancing the in Situ Visualization of Performance Data in Parallel CFD Applications","authors":"Rigel F. C. Alves, A. Knüpfer","doi":"10.14529/jsfi200402","DOIUrl":"https://doi.org/10.14529/jsfi200402","url":null,"abstract":"This paper continues the work initiated by the authors on the feasibility of using ParaView as visualization software for the analysis of parallel Computational Fluid Dynamics (CFD) codes’ performance. Current performance tools have limited capacity of displaying their data on top of three-dimensional, framed (i.e., time-stepped) representations of the cluster’s topology. In our first paper, a plugin for the open-source performance tool Score-P was introduced, which intercepts an arbitrary number of manually selected code regions (mostly functions) and send their respective measurements–amount of executions and cumulative time spent–to ParaView (through its in situ library, Catalyst), as if they were any other flow-related variable. Our second paper added to such plugin the capacity to (also) map communication data (messages exchanged between MPI ranks) to the simulation’s geometry. So far the tool was limited to codes which already have the in situ adapter; but in this paper, we will take the performance data and display it–also in codes without in situ–on a three-dimensional representation of the hardware resources being used by the simulation. Testing is done with the Multi-Grid and Block Tri-diagonal NPBs, as well as Rolls-Royce’s CFD code, Hydra. The benefits and overhead of the plugin's new functionalities are discussed.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. 
Innov.","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125934271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
D. Druzhilovskiy, L. Stolbov, P. Savosina, P. Pogodin, D. Filimonov, A. Veselovsky, K. Stefanisko, N. Tarasova, M. Nicklaus, V. Poroikov
To improve the discovery of more effective and less toxic pharmaceutical agents, large virtual repositories of synthesizable molecules have been generated to increase the diversity of the explored chemical-pharmacological space. Such libraries include billions of structural formulae of drug-like molecules associated with data on synthetic schemes, required building blocks, estimated physical-chemical parameters, etc. Clearly, such repositories are “Big Data”. Thus, to identify the most promising compounds with the required pharmacological properties (hits) among billions of available opportunities, special computational methods are necessary. We have proposed a computational approach that combines structural similarity assessment, machine learning, and molecular modeling. Our approach has been validated in a project aimed at finding new pharmaceutical agents against HIV/AIDS and associated comorbidities in the Synthetically Accessible Virtual Inventory (SAVI), a database of 1.75 billion compounds. Potential inhibitors of HIV-1 protease and reverse transcriptase, and agonists of toll-like receptors and STING, which affect innate immunity, were computationally identified. The activity of the three synthesized compounds has been confirmed in a cell-based assay. These compounds belong to chemical classes in which the agonistic effect on TLR 7/8 had not been previously shown. Synthesis and biological testing of several dozen compounds with predicted antiretroviral activity are currently taking place at the NCI/NIH. We also carried out virtual screening among one billion substances to find compounds potentially possessing anti-SARS-CoV-2 activity. Information on the selected hits has been accepted by the European initiative “JEDI Grand Challenge against COVID-19” for synthesis and further biological evaluation. The possibilities and limitations of the approach are discussed.
{"title":"Computational Approaches To Identify A Hidden Pharmacological Potential In Large Chemical Libraries","authors":"D. Druzhilovskiy, L. Stolbov, P. Savosina, P. Pogodin, D. Filimonov, A. Veselovsky, K. Stefanisko, N. Tarasova, M. Nicklaus, V. Poroikov","doi":"10.14529/jsfi200306","DOIUrl":"https://doi.org/10.14529/jsfi200306","url":null,"abstract":"To improve the discovery of more effective and less toxic pharmaceutical agents, large virtual repositories of synthesizable molecules have been generated to increase the explored chemical-pharmacological space diversity. Such libraries include billions of structural formulae of drug-like molecules associated with data on synthetic schemes, required building blocks, estimated physical-chemical parameters, etc. Clearly, such repositories are “Big Data”. Thus, to identify the most promising compounds with the required pharmacological properties (hits) among billions of available opportunities, special computational methods are necessary. We have proposed using a combined computational approach, which combines structural similarity assessment, machine learning, and molecular modeling. Our approach has been validated in a project aimed at finding new pharmaceutical agents against HIV/AIDS and associated comorbidities from the Synthetically Accessible Virtual Inventory (SAVI), a 1.75 billion compound database. Potential inhibitors of HIV-1 protease and reverse transcriptase and agonists of toll-like receptors and STING, affecting innate immunity, were computationally identified. The activity of the three synthesized compounds has been confirmed in a cell-based assay. These compounds belong to the chemical classes, in which the agonistic effect on TLR 7/8 had not been previously shown. Synthesis and biological testing of several dozens of compounds with predicted antiretroviral activity are currently taking place at the NCI/NIH. 
We also carried out virtual screening among one billion substances to find compounds potentially possessing anti-SARS-CoV-2 activity. The selected hits' information has been accepted by the European Initiative “JEDI Grand Challenge against COVID-19” for synthesis and further biological evaluation. The possibilities and limitations of the approach are discussed.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. Innov.","volume":"42 13","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133489586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Molecular dynamics simulations with QM(DFT)/MM potentials are used to discriminate between reactive and nonreactive complexes of the SARS-CoV-2 main protease and its substrates. Frames along the molecular dynamics trajectories are classified by analysing 2D maps of the Laplacian of the electron density. These maps are calculated in the plane formed by the carbonyl group of the substrate and the nucleophilic sulfur atom of the cysteine residue that initiates the enzymatic reaction. The use of a GPU-based DFT code allows fast and accurate simulations with the hybrid functional PBE0 and a double-zeta basis set. Excluding the polarization functions accelerates the calculations two-fold; however, the resulting basis set does not describe the substrate activation. A larger basis set with d-functions on heavy atoms and p-functions on hydrogen atoms makes it possible to reveal the equilibrium between the reactive and nonreactive species along the MD trajectory. The suggested approach can be used to choose covalent inhibitors that will readily interact with the catalytic residue of the selected enzyme.
{"title":"Computational Characterization of the Substrate Activation in the Active Site of SARS-CoV-2 Main Protease","authors":"M. Khrenova, V. Tsirelson, A. Nemukhin","doi":"10.14529/jsfi200304","DOIUrl":"https://doi.org/10.14529/jsfi200304","url":null,"abstract":"Molecular dynamics simulations with the QM(DFT)/MM potentials are utilized to discriminate between reactive and nonreactive complexes of the SARS-CoV-2 main protease and its substrates. Classification of frames along the molecular dynamic trajectories is utilized by analysis of the 2D maps of the Laplacian of electron density. Those are calculated in the plane formed by the carbonyl group of the substrate and a nucleophilic sulfur atom of the cysteine residue that initiates enzymatic reaction. Utilization of the GPU-based DFT code allows fast and accurate simulations with the hybrid functional PBE0 and double-zeta basis set. Exclusion of the polarization functions accelerates the calculations 2-fold, however this does not describe the substrate activation. Larger basis set with d-functions on heavy atoms and p-functions on hydrogen atoms enables to disclose equilibrium between the reactive and nonreactive species along the MD trajectory. The suggested approach can be utilized to choose covalent inhibitors that will readily interact with the catalytic residue of the selected enzyme.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. Innov.","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124934023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Nemukhin, B. Grigorenko, I. Polyakov, S. Lushchekina
We illustrate modern modeling tools applied in the computational design of drugs acting as covalent inhibitors of enzymes. We take the main protease (Mpro) from the SARS-CoV-2 virus as an important present-day representative. In this work, we construct a compound capable of blocking Mpro, which is composed of fragments of antimalarial drugs and covalent inhibitors of cysteine proteases. To characterize the mechanism of its interaction with the enzyme, algorithms based on force fields, including molecular mechanics (MM), molecular dynamics (MD) and molecular docking, as well as quantum-based approaches, including quantum chemistry and quantum mechanics/molecular mechanics (QM/MM) methods, should be applied. The use of supercomputers is indispensable at least in the latter approach. Its application to enzymes assumes that energies and forces in the active sites are computed using methods of quantum chemistry, whereas the rest of the protein matrix is described using conventional force fields. For the proposed compound, containing the benzoisothiazolone fragment and a substituent at the uracil ring, we show that it can form a stable covalently bound adduct with the target enzyme, and thus can be recommended for experimental trials.
{"title":"Computational Modeling of the SARS-CoV-2 Main Protease Inhibition by the Covalent Binding of Prospective Drug Molecules","authors":"A. Nemukhin, B. Grigorenko, I. Polyakov, S. Lushchekina","doi":"10.14529/jsfi200303","DOIUrl":"https://doi.org/10.14529/jsfi200303","url":null,"abstract":"We illustrate modern modeling tools applied in the computational design of drugs acting as covalent inhibitors of enzymes. We take the Main protease (M pro ) from the SARS-CoV-2 virus as an important present-day representative. In this work, we construct a compound capable to block M pro , which is composed of fragments of antimalarial drugs and covalent inhibitors of cysteine proteases. To characterize the mechanism of its interaction with the enzyme, the algorithms based on force fields, including molecular mechanics (MM), molecular dynamics (MD) and molecular docking, as well as quantum-based approaches, including quantum chemistry and quantum mechanics/molecular mechanics (QM/MM) methods, should be applied. The use of supercomputers is indispensably important at least in the latter approach. Its application to enzymes assumes that energies and forces in the active sites are computed using methods of quantum chemistry, whereas the rest of protein matrix is described using conventional force fields. For the proposed compound, containing the benzoisothiazolone fragment and the substitute at the uracil ring, we show that it can form a stable covalently bound adduct with the target enzyme, and thus can be recommended for experimental trials.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. 
Innov.","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128072929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Sulimov, D. Kutov, Anna S. Taschilova, I. Ilin, N. Stolpovskaya, K. Shikhaliev, V. Sulimov
A two-stage virtual screening of a database containing several thousand low molecular weight organic compounds is performed with the goal of finding inhibitors of the SARS-CoV-2 main protease. Overall, nearly 41000 different 3D molecular structures have been generated from the initial molecules, taking into account several conformers of most molecules. At the first stage, the classical SOL docking program is used to determine the most promising inhibitor candidates. SOL employs the MMFF94 force field and a genetic algorithm (GA) for global energy optimization, and takes into account the desolvation effect arising upon protein-ligand binding and the internal stress energy of the ligand. The GA parameters are selected to perform a meticulous global optimization, so docking one ligand takes several hours on one computing core on average. The main protease model is constructed on the basis of the protein structure from the Protein Data Bank complex 6W63. More than 1000 ligand structures have been selected for further postprocessing. The SOL score values of these ligands are more negative than the threshold of –6.3 kcal/mol obtained for the docking of the native X77 ligand. Subsequent calculation of the protein-ligand binding enthalpy by the PM7 quantum-chemical semiempirical method with the COSMO solvent model has narrowed down the number of best candidates. Finally, a diverse set of the 20 most promising candidates for in vitro validation is selected.
{"title":"In Search of Non-covalent Inhibitors of SARS-CoV-2 Main Protease: Computer Aided Drug Design Using Docking and Quantum Chemistry","authors":"A. Sulimov, D. Kutov, Anna S. Taschilova, I. Ilin, N. Stolpovskaya, K. Shikhaliev, V. Sulimov","doi":"10.14529/jsfi200305","DOIUrl":"https://doi.org/10.14529/jsfi200305","url":null,"abstract":"Two stages virtual screening of a database containing several thousand low molecular weight organic compounds is performed with the goal to find inhibitors of SARS-CoV-2 main protease. Overall near 41000 different 3D molecular structures have been generated from the initial molecules taking into account several conformers of most molecules. At the first stage the classical SOL docking program is used to determine most promising candidates to become inhibitors. SOL employs the MMFF94 force field, the genetic algorithm (GA) of the global energy optimization, takes into account the desolvation effect arising upon protein-ligand binding and the internal stress energy of the ligand. Parameters of GA are selected to perform the meticulous global optimization, and for docking of one ligand several hours on one computing core are needed on the average. The main protease model is constructed on the base of the protein structure from the Protein Data Bank complex 6W63. More than 1000 ligands structures have been selected for further postprocessing. The SOL score values of these ligands are more negative than the threshold of –6.3 kcal/mol obtained for the native X77 ligand docking. Subsequent calculation of the protein-ligand binding enthalpy by the PM7 quantum-chemical semiempirical method with COSMO solvent model have narrowed down the number of best candidates. Finally, the diverse set of 20 most perspective candidates for the in vitro validation are selected.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. 
Innov.","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122303350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This review starts by outlining how science and technology evolved from the last century into high-throughput science and technology in the modern era, due to the Nobel-Prize-level inventions of combinatorial chemistry, polymerase chain reaction, and high-throughput screening. This evolution has resulted in big data accumulated in the life sciences and the fields of drug discovery. Big data demands supercomputing in biology and medicine, although computing complexity is still a grand challenge for sophisticated biosystems in drug design in this supercomputing era. In order to resolve real-world issues, artificial intelligence algorithms (specifically machine learning approaches) were introduced and have demonstrated their power in discovering structure-activity relations hidden in big biochemical data. In particular, this review summarizes how researchers have modernized conventional machine learning algorithms by combining non-numeric pattern recognition and deep learning algorithms, and have successfully resolved drug design and high-throughput screening issues. The review ends with perspectives on computational opportunities and challenges in drug discovery by introducing new drug design principles and modeling the process of packing DNA with histones in micrometer-scale space, an example of how a macrocosm object gets into a microcosm world.
{"title":"Perspectives on Supercomputing and Artificial Intelligence Applications in Drug Discovery","authors":"Jun Xu, Jiming Ye","doi":"10.14529/jsfi200302","DOIUrl":"https://doi.org/10.14529/jsfi200302","url":null,"abstract":"This review starts with outlining how science and technology evaluated from last century into high throughput science and technology in modern era due to the Nobel-Prize-level inventions of combinatorial chemistry, polymerase chain reaction, and high-throughput screening. The evolution results in big data accumulated in life sciences and the fields of drug discovery. The big data demands for supercomputing in biology and medicine, although the computing complexity is still a grand challenge for sophisticated biosystems in drug design in this supercomputing era. In order to resolve the real-world issues, artificial intelligence algorithms (specifically machine learning approaches) were introduced, and have demonstrated the power in discovering structure-activity relations hidden in big biochemical data. Particularly, this review summarizes on how people modernize the conventional machine learning algorithms by combing non-numeric pattern recognition and deep learning algorithms, and successfully resolved drug design and high throughput screening issues. The review ends with the perspectives on computational opportunities and challenges in drug discovery by introducing new drug design principles and modeling the process of packing DNA with histones in micrometer scale space, a n example of how a macrocosm object gets into microcosm world.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. 
Innov.","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124713675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}