S. Lushchekina, G. Makhaeva, D. Novichkova, I. Zueva, N. Kovaleva, Rudy R. Richardson
Molecular docking is one of the most popular tools of molecular modeling. However, in certain cases, such as the development of cholinesterase inhibitors as therapeutic agents for Alzheimer's disease, many aspects must be taken into account to achieve accurate docking results. For simple molecular docking with popular software and standard protocols, a personal computer is sufficient; quite often, however, the results are irrelevant. Due to the complex biochemistry and biophysics of cholinesterases, computational research should be supported with quantum mechanics (QM) and molecular dynamics (MD) calculations, which requires the use of supercomputers. Experimental studies of inhibition kinetics can discriminate between different types of inhibition (competitive, non-competitive, or mixed), which is quite helpful for assessing the docking results. Here we consider inhibition of human acetylcholinesterase (AChE) by the conjugate of MB and 2,8-dimethyl-tetrahydro-γ-carboline, study its interactions with AChE in relation to the experimental data, and use it as an example to elucidate crucial points for reliable docking studies of bulky AChE inhibitors. Molecular docking results were found to be extremely sensitive to the choice of the X-ray AChE structure used as the docking target and the scheme selected for the distribution of partial atomic charges. It was also demonstrated that flexible docking should be used with additional caution, because certain protein conformational changes might not correspond to available X-ray and MD data.
{"title":"Supercomputer Modeling of Dual-Site Acetylcholinesterase (AChE) Inhibition","authors":"S. Lushchekina, G. Makhaeva, D. Novichkova, I. Zueva, N. Kovaleva, Rudy R. Richardson","doi":"10.14529/JSFI180410","DOIUrl":"https://doi.org/10.14529/JSFI180410","url":null,"abstract":"Molecular docking is one of the most popular tools of molecular modeling. However, in certain cases, like development of inhibitors of cholinesterases as therapeutic agents for Alzheimer's disease, there are many aspects, which should be taken into account to achieve accurate docking results. For simple molecular docking with popular software and standard protocols, a personal computer is sucient, however quite often the results are irrelevant. Due to the complex biochemistry and biophysics of cholinesterases, computational research should be supported with quantum mechanics (QM) and molecular dynamics (MD) calculations, what requires the use of supercomputers. Experimental studies of inhibition kinetics can discriminate between dierent types of inhibition—competitive, non-competitive or mixed type—that is quite helpful for assessment of the docking results. Here we consider inhibition of human acetylcholinesterase (AChE) by the conjugate of MB and 2,8-dimethyl-tetrahydro-y-carboline, study its interactions with AChE in relation to the experimental data, and use it as an example to elucidate crucial points for reliable docking studies of bulky AChE inhibitors. Molecular docking results were found to be extremely sensitive to the choice of the X-ray AChE structure for the docking target and the scheme selected for the distribution of partial atomic charges. It was demonstrated that exible docking should be used with an additional caution, because certain protein conformational changes might not correspond with available X-ray and MD data.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. Innov.","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128570332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Luszczek, J. Kurzak, I. Yamazaki, D. Keffer, V. Maroulas, J. Dongarra
We present an autotuning approach applied to exhaustive performance engineering of the EM-ICP algorithm for the point set registration problem with a known reference. We were able to achieve progressively higher performance levels through a variety of code transformations and an automated procedure for generating a large number of implementation variants. Furthermore, we managed to exploit code patterns that are not common when only attempting manual optimization but which, in our tests, yielded better performance for the chosen registration algorithm. Finally, we also show how we maintained high performance in a portable fashion across a wide range of hardware platforms, including multicore processors, manycore coprocessors, and accelerators. Each of these hardware classes is much different from the others and, consequently, cannot reliably be mastered by a single developer in the short time required to deliver a close-to-optimal implementation. We assert in our concluding remarks that our methodology, as well as the presented tools, provides a valid automation system for software optimization tasks on modern HPC hardware.
{"title":"Autotuning Techniques for Performance-Portable Point Set Registration in 3D","authors":"P. Luszczek, J. Kurzak, I. Yamazaki, D. Keffer, V. Maroulas, J. Dongarra","doi":"10.14529/JSFI180404","DOIUrl":"https://doi.org/10.14529/JSFI180404","url":null,"abstract":"We present an autotuning approach applied to exhaustive performance engineering of the EM-ICP algorithm for the point set registration problem with a known reference. We were able to achieve progressively higher performance levels through a variety of code transformations and an automated procedure of generating a large number of implementation variants. Furthermore, we managed to exploit code patterns that are not common when only attempting manual optimization but which yielded in our tests better performance for the chosen registration algorithm. Finally, we also show how we maintained high levels of the performance rate in a portable fashion across a wide range of hardware platforms including multicore, manycore coprocessors, and accelerators. Each of these hardware classes is much different from the others and, consequently, cannot reliably be mastered by a single developer in a short time required to deliver a close-to-optimal implementation. We assert in our concluding remarks that our methodology as well as the presented tools provide a valid automation system for software optimization tasks on modern HPC hardware.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. Innov.","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126008155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
V. Elisseev, Milos Puzovic, Eun Kyung Lee
On the path to Exascale, the goal of High Performance Computing (HPC), achieving maximum performance, becomes the goal of achieving maximum performance under a strict power constraint. Novel approaches to hardware and software co-design of modern HPC systems have to be developed to address such challenges. In this paper, we study prediction of the power consumption of HPC systems using metrics obtained from hardware performance counters. We argue that this methodology is portable across different microarchitecture implementations and compare results obtained on Intel 64, IBM, and Cavium ThunderX ARMv8 microarchitectures. We discuss the optimal number and type of hardware performance counters required to accurately predict power consumption. We compare the accuracy of power predictions provided by models based on Linear Regression (LR) and Neural Networks (NN). We find that the NN-based model provides better prediction accuracy than the LR model. We also find that it is not yet possible to predict power consumption on a given microarchitecture using data obtained on a different microarchitecture. The results of our work can be used as a starting point for developing unified, cross-architectural models for predicting power consumption.
"A Study on Cross-Architectural Modelling of Power Consumption Using Neural Networks." Supercomput. Front. Innov., 2018-11-27. DOI: 10.14529/JSFI180403.
E. Tyutlyaeva, A. Moskovsky, I. Odintsov, S. Konyukhov, A. Poyda, M. Zhizhin, Igor V. Polyakov
A wide range of modern system architectures and platforms targeted at different algorithms and application areas is now available. Even general-purpose systems have advantages in some computation areas and bottlenecks in others. Scientific applications in specific areas, on the other hand, have different requirements for CPU performance, scalability, and power consumption. The best practice now is an algorithm/architecture co-exploration approach, where the requirements of the scientific problem influence the hardware configuration, while the algorithm implementation is refactored and optimized in accordance with the platform's architectural features. In this research, two typical modules used for multispectral nighttime satellite image processing are studied:
• measurement of local perceived sharpness in the visible band using the Fourier transform;
• cross-correlation in a moving window between the visible and infrared bands.
Both modules are optimized and studied on a wide range of up-to-date testbeds based on different architectures. Our testbeds include computational nodes based on the Intel Xeon E5-2697A v4, Intel Xeon Phi, Texas Instruments Sitara AM5728 dual-core ARM Cortex-A15, and NVIDIA Jetson TX2. The study includes performance testing and energy consumption measurements. The results can be used for assessing the suitability of each platform for multispectral nighttime satellite image processing by two key parameters: execution time and energy consumption.
{"title":"Multicore Platform Efficiency Across Remote Sensing Applications","authors":"E. Tyutlyaeva, A. Moskovsky, I. Odintsov, S. Konyukhov, A. Poyda, M. Zhizhin, Igor V. Polyakov","doi":"10.14529/JSFI180402","DOIUrl":"https://doi.org/10.14529/JSFI180402","url":null,"abstract":"A wide range of modern system architectures and platforms targeted for different algorithms and application areas is now available. Even general-purpose systems have advantages in some computation areas and bottlenecks in another. Scientific applications on specific areas, on the other hand, have different requirements for CPU performance, scalability and power consumption. The best practice now is algorithm/architecture co-exploration approach, where scientific problem requirements influence the hardware configuration; on the other hand, algorithm implementation is re factored and optimized in accordance with the platform architectural features. In this research, two typical modules used for multispectral nighttime satellite image processing are studied: • measurement of local perceived sharpness in visible band using the Fourier transform; • cross-correlation in a moving window between visible and infrared bands. Both modules are optimized and studied on wide range of up-to-date testbeds, based on different architectures. Our testbeds include computational nodes based on Intel Xeon E5-2697A v4, Intel Xeon Phi, Texas Instruments Sitara AM5728 dual-core ARM Cortex-A15, and NVIDIA JETSON TX2. The study includes performance testing and energy consumption measurements. The results achieved can be used for assessing serviceability for multispectral nighttime satellite image processing by two key parameters: execution time and energy consumption.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. Innov.","volume":"13 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125823486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
N. S. Smirnova, M. Dumbser, M. Petrov, Alexander V. Chikitkin, E. Romenski
In this paper we propose a new flux splitting approach for the symmetric hyperbolic thermodynamically compatible (SHTC) equations of compressible two-phase flow which can be used in finite-volume methods. The approach is based on splitting the entire model into acoustic and pseudo-convective submodels. The associated acoustic system is solved numerically by applying an HLLC-type Riemann solver to its Lagrangian form. The convective part of the pseudo-convective submodel is solved by a standard upwind scheme. For the other parts of the pseudo-convective submodel we apply the FORCE method. A comparison is carried out with unsplit methods. Numerical results are obtained on several test problems and show good agreement with exact solutions and reference calculations.
{"title":"A Flux Splitting Method for the SHTC Model for High-performance Simulations of Two-phase Flows","authors":"N. S. Smirnova, M. Dumbser, M. Petrov, Alexander V. Chikitkin, E. Romenski","doi":"10.14529/JSFI180315","DOIUrl":"https://doi.org/10.14529/JSFI180315","url":null,"abstract":"In this paper we propose a new flux splitting approach for the symmetric hyperbolic thermodynamically compatible (SHTC) equations of compressible two-phase flow which can be used in finite-volume methods. The approach is based on splitting the entire model into acoustic and pseudo-convective submodels. The associated acoustic system is numerically solved applying HLLC-type Riemann solver for its Lagrangian form. The convective part of the pseudo-convective submodel is solved by a standart upwind scheme. For other parts of the pseudo-convective submodel we apply the FORCE method. A comparison is carried out with unsplit methods. Numerical results are obtained on several test problems. Results show good agreement with exact solutions and reference calculations.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. Innov.","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124019008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Setukha, V. A. Aparinov, A. Aparinov
In this article the authors present a parallel implementation of a numerical method for computer modeling of the dynamics of a parachute with a filled canopy. To solve the 3D problem of parachute free motion numerically, the authors formulate a coupled problem of dynamics and aerodynamics, where the aerodynamic characteristics are found with the discrete vortex method at each time integration step, and the corresponding equations of motion are solved to find the motion law. The solution of such problems requires substantial computational resources, because parachute motion has to be modeled over a long physical time period. Moreover, the behavior of the vortex wake behind the parachute is important and has to be modeled. In the authors' approach the wake is modeled as a set of flexible vortex elements. To increase computational efficiency, the authors use methods of low-rank matrix approximation as well as parallel implementations of the algorithms. A short description of the numerical method is presented, along with examples of numerical modeling.
"Supercomputer Modeling of Parachute Flight Dynamics." Supercomput. Front. Innov., 2018-09-01. DOI: 10.14529/jsfi180323.
P. Komarov, D. Guseva, V. Rudyak, A. Chertovich
Atomistic molecular dynamics simulations can usually cover only a very limited range in space and time. Thus, materials like polymer resin networks, whose properties are formed on the macroscopic scale, are hard to study thoroughly using molecular dynamics alone. Our work presents a multiscale simulation methodology to overcome this shortcoming. To demonstrate its effectiveness, we conducted a study of the thermal and mechanical properties of complex polymer matrices and established a direct correspondence between simulations and experimental results. We believe this methodology can be successfully used for predictive simulations of a broad range of polymer matrices in the glassy state.
"Multiscale Simulations Approach: Crosslinked Polymer Matrices." Supercomput. Front. Innov., 2018-09-01. DOI: 10.14529/JSFI180309.
K. Anisimov, A. Savelyev, I. A. Kursakov, A. Lysenkov, P. Prakasha
Nacelle shape optimization for a Blended Wing Body (BWB) aircraft is performed. The optimization procedure is based on numerical solutions of the Reynolds-averaged Navier-Stokes equations. The propulsion system was designed for the Top Level Aircraft Requirements formulated in the AGILE project. The optimization procedure was divided into two steps. In the first step, the isolated nacelle was designed and optimized for cruise regimes; this step is described in Section 3. In the second step, the nacelle positions over the airframe were optimized. To find the optimum solution, a surrogate-based Efficient Global Optimization algorithm is used. Automatic creation of a structured computational mesh is implemented so that the optimization algorithm can work effectively. The whole procedure is considered in the context of the third-generation multidisciplinary optimization techniques developed within the AGILE project. During the project, new techniques are to be implemented for the novel aircraft configurations chosen as test cases for the application of AGILE technologies. It is shown that the optimization technology meets all requirements and is suitable for use in the AGILE project.
{"title":"Optimization of BWB Aircraft Using Parallel Computing","authors":"K. Anisimov, A. Savelyev, I. A. Kursakov, A. Lysenkov, P. Prakasha","doi":"10.14529/JSFI180317","DOIUrl":"https://doi.org/10.14529/JSFI180317","url":null,"abstract":"Nacelle shape optimization for Blended Wing Body (BWB) is performed. The optimization procedure is based on numerical calculations of the Reynolds–averaged Navier–Stokes equations. For the Top Level Aircraft Requirements, formulated in AGILE project, the propulsion system was designed. The optimization procedure was divided in two steps. At first step, the isolated nacelle was designed and optimized for cruise regimes. This step is listed in paragraph 3. At second step the nacelles positions over airframe were optimized. To find the optimum solution, surrogate–based Efficient Global Optimization algorithm is used. An automatic structural computational mesh creation is realized for the effective optimization algorithm working. This whole procedure is considered in the context of the third generation multidisciplinary optimization techniques, developed within AGILE project. During the project, new techniques should be implemented for the novel aircraft configurations, chosen as test cases for application of AGILE technologies. It is shown that the optimization technology meets all requirements and is suitable for using in the AGILE project.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. Innov.","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126211148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Brodowicz, T. Sterling, Matthew Anderson
The end of Moore's Law is a cliche that is nonetheless a hard barrier to future scaling of high performance computing systems. A factor of about 4x in device density is all that is left of this form of improved throughput, while a 5x gain is required just to reach the milestone of exascale. The remaining sources of performance improvement are better delivered efficiency, of more than 10x, and alternative architectures that make better use of chip real estate. This paper discusses the set of principles guiding a potential future of non-von Neumann architectures as adopted by the experimental class of Continuum Computer Architecture (CCA), which is being explored by the Semantic Memory Architecture Research Team (SMART) at Indiana University. CCA comprises a homogeneous aggregation of cellular components (function cells) which are orders of magnitude smaller than lightweight cores and are individually unable to accomplish a computation, but in combination can do so with extreme cost efficiency and unprecedented scalability. It will be seen that a path exists, based on such unconventional methods as neuromorphic computing or dataflow, that not only will meet the likely exascale milestone in the same time frame with much better power, cost, and size, but also will set a new performance trajectory leading to Zettaflops capability before 2030.
"Continuum Computing - on a New Performance Trajectory beyond Exascale." Supercomput. Front. Innov., 2018-09-01. DOI: 10.14529/JSFI180301.
V. Lisitsa, V. Tcheverda, V. Volianskaia
We present an algorithm for numerical simulation of geological fault formation. The approach is based on the discrete element method, which allows modeling of the deformations and structural discontinuities of the upper part of the Earth's crust. In the discrete element method, the medium is represented as a combination of discrete particles which interact as elastic or viscoelastic bodies. Additionally, external potential forces, for example gravitational forces, may be introduced. At each time step the full set of forces acting on each particle is computed, after which the position of the particle is updated on the basis of Newtonian mechanics. We implement the algorithm using CUDA technology to simulate a single statistical realization of the model, whereas MPI is used to parallelize over different statistical realizations. The numerical results show that low dip angles of the tectonic displacements produce relatively narrow faults, whereas high dip angles of the tectonic displacements lead to wide V-shaped deformation zones.
"GPU-based Implementation of Discrete Element Method for Simulation of the Geological Fault Geometry and Position." Supercomput. Front. Innov., 2018-09-01. DOI: 10.14529/JSFI180307.