S. Lushchekina, G. Makhaeva, D. Novichkova, I. Zueva, N. Kovaleva, Rudy R. Richardson
Molecular docking is one of the most popular tools of molecular modeling. However, in certain cases, such as the development of cholinesterase inhibitors as therapeutic agents for Alzheimer's disease, many aspects must be taken into account to achieve accurate docking results. For simple molecular docking with popular software and standard protocols, a personal computer is sufficient; quite often, however, the results are irrelevant. Due to the complex biochemistry and biophysics of cholinesterases, computational research should be supported with quantum mechanics (QM) and molecular dynamics (MD) calculations, which requires the use of supercomputers. Experimental studies of inhibition kinetics can discriminate between different types of inhibition (competitive, non-competitive, or mixed), which is quite helpful for assessing the docking results. Here we consider inhibition of human acetylcholinesterase (AChE) by the conjugate of MB and 2,8-dimethyl-tetrahydro-γ-carboline, study its interactions with AChE in relation to the experimental data, and use it as an example to elucidate crucial points for reliable docking studies of bulky AChE inhibitors. Molecular docking results were found to be extremely sensitive to the choice of the X-ray AChE structure used as the docking target and the scheme selected for the distribution of partial atomic charges. It was also demonstrated that flexible docking should be used with additional caution, because certain protein conformational changes might not correspond to available X-ray and MD data.
{"title":"Supercomputer Modeling of Dual-Site Acetylcholinesterase (AChE) Inhibition","authors":"S. Lushchekina, G. Makhaeva, D. Novichkova, I. Zueva, N. Kovaleva, Rudy R. Richardson","doi":"10.14529/JSFI180410","DOIUrl":"https://doi.org/10.14529/JSFI180410","url":null,"abstract":"Molecular docking is one of the most popular tools of molecular modeling. However, in certain cases, like development of inhibitors of cholinesterases as therapeutic agents for Alzheimer's disease, there are many aspects, which should be taken into account to achieve accurate docking results. For simple molecular docking with popular software and standard protocols, a personal computer is sucient, however quite often the results are irrelevant. Due to the complex biochemistry and biophysics of cholinesterases, computational research should be supported with quantum mechanics (QM) and molecular dynamics (MD) calculations, what requires the use of supercomputers. Experimental studies of inhibition kinetics can discriminate between dierent types of inhibition—competitive, non-competitive or mixed type—that is quite helpful for assessment of the docking results. Here we consider inhibition of human acetylcholinesterase (AChE) by the conjugate of MB and 2,8-dimethyl-tetrahydro-y-carboline, study its interactions with AChE in relation to the experimental data, and use it as an example to elucidate crucial points for reliable docking studies of bulky AChE inhibitors. Molecular docking results were found to be extremely sensitive to the choice of the X-ray AChE structure for the docking target and the scheme selected for the distribution of partial atomic charges. It was demonstrated that exible docking should be used with an additional caution, because certain protein conformational changes might not correspond with available X-ray and MD data.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. Innov.","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128570332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Luszczek, J. Kurzak, I. Yamazaki, D. Keffer, V. Maroulas, J. Dongarra
We present an autotuning approach applied to exhaustive performance engineering of the EM-ICP algorithm for the point set registration problem with a known reference. We were able to achieve progressively higher performance levels through a variety of code transformations and an automated procedure for generating a large number of implementation variants. Furthermore, we managed to exploit code patterns that are not common when only attempting manual optimization but which, in our tests, yielded better performance for the chosen registration algorithm. Finally, we also show how we maintained high performance in a portable fashion across a wide range of hardware platforms, including multicore processors, manycore coprocessors, and accelerators. Each of these hardware classes is much different from the others and, consequently, cannot reliably be mastered by a single developer in the short time required to deliver a close-to-optimal implementation. We assert in our concluding remarks that our methodology, as well as the presented tools, provides a valid automation system for software optimization tasks on modern HPC hardware.
{"title":"Autotuning Techniques for Performance-Portable Point Set Registration in 3D","authors":"P. Luszczek, J. Kurzak, I. Yamazaki, D. Keffer, V. Maroulas, J. Dongarra","doi":"10.14529/JSFI180404","DOIUrl":"https://doi.org/10.14529/JSFI180404","url":null,"abstract":"We present an autotuning approach applied to exhaustive performance engineering of the EM-ICP algorithm for the point set registration problem with a known reference. We were able to achieve progressively higher performance levels through a variety of code transformations and an automated procedure of generating a large number of implementation variants. Furthermore, we managed to exploit code patterns that are not common when only attempting manual optimization but which yielded in our tests better performance for the chosen registration algorithm. Finally, we also show how we maintained high levels of the performance rate in a portable fashion across a wide range of hardware platforms including multicore, manycore coprocessors, and accelerators. Each of these hardware classes is much different from the others and, consequently, cannot reliably be mastered by a single developer in a short time required to deliver a close-to-optimal implementation. We assert in our concluding remarks that our methodology as well as the presented tools provide a valid automation system for software optimization tasks on modern HPC hardware.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. Innov.","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126008155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
V. Elisseev, Milos Puzovic, Eun Kyung Lee
On the path to Exascale, the goal of High Performance Computing (HPC), achieving maximum performance, becomes the goal of achieving maximum performance under a strict power constraint. Novel approaches to hardware and software co-design of modern HPC systems have to be developed to address such challenges. In this paper, we study prediction of the power consumption of HPC systems using metrics obtained from hardware performance counters. We argue that this methodology is portable across different microarchitecture implementations and compare results obtained on Intel 64, IBM, and Cavium ThunderX ARMv8 microarchitectures. We discuss the optimal number and type of hardware performance counters required to accurately predict power consumption. We compare the accuracy of power predictions provided by models based on Linear Regression (LR) and Neural Networks (NN). We find that the NN-based model provides better prediction accuracy than the LR model. We also find that it is not yet possible to predict power consumption on a given microarchitecture using data obtained on a different microarchitecture. The results of our work can be used as a starting point for developing unified, cross-architectural models for predicting power consumption.
"A Study on Cross-Architectural Modelling of Power Consumption Using Neural Networks." Supercomput. Front. Innov., 2018-11-27. DOI: 10.14529/JSFI180403.
E. Tyutlyaeva, A. Moskovsky, I. Odintsov, S. Konyukhov, A. Poyda, M. Zhizhin, Igor V. Polyakov
A wide range of modern system architectures and platforms targeted at different algorithms and application areas is now available. Even general-purpose systems have advantages in some computation areas and bottlenecks in others. Scientific applications in specific areas, on the other hand, have different requirements for CPU performance, scalability, and power consumption. The best practice now is an algorithm/architecture co-exploration approach, where the requirements of the scientific problem influence the hardware configuration, while the algorithm implementation is refactored and optimized in accordance with the platform's architectural features. In this research, two typical modules used for multispectral nighttime satellite image processing are studied:
• measurement of local perceived sharpness in the visible band using the Fourier transform;
• cross-correlation in a moving window between the visible and infrared bands.
Both modules are optimized and studied on a wide range of up-to-date testbeds based on different architectures. Our testbeds include computational nodes based on the Intel Xeon E5-2697A v4, Intel Xeon Phi, Texas Instruments Sitara AM5728 dual-core ARM Cortex-A15, and NVIDIA Jetson TX2. The study includes performance testing and energy consumption measurements. The results can be used for assessing the suitability of each platform for multispectral nighttime satellite image processing by two key parameters: execution time and energy consumption.
{"title":"Multicore Platform Efficiency Across Remote Sensing Applications","authors":"E. Tyutlyaeva, A. Moskovsky, I. Odintsov, S. Konyukhov, A. Poyda, M. Zhizhin, Igor V. Polyakov","doi":"10.14529/JSFI180402","DOIUrl":"https://doi.org/10.14529/JSFI180402","url":null,"abstract":"A wide range of modern system architectures and platforms targeted for different algorithms and application areas is now available. Even general-purpose systems have advantages in some computation areas and bottlenecks in another. Scientific applications on specific areas, on the other hand, have different requirements for CPU performance, scalability and power consumption. The best practice now is algorithm/architecture co-exploration approach, where scientific problem requirements influence the hardware configuration; on the other hand, algorithm implementation is re factored and optimized in accordance with the platform architectural features. In this research, two typical modules used for multispectral nighttime satellite image processing are studied: • measurement of local perceived sharpness in visible band using the Fourier transform; • cross-correlation in a moving window between visible and infrared bands. Both modules are optimized and studied on wide range of up-to-date testbeds, based on different architectures. Our testbeds include computational nodes based on Intel Xeon E5-2697A v4, Intel Xeon Phi, Texas Instruments Sitara AM5728 dual-core ARM Cortex-A15, and NVIDIA JETSON TX2. The study includes performance testing and energy consumption measurements. The results achieved can be used for assessing serviceability for multispectral nighttime satellite image processing by two key parameters: execution time and energy consumption.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. Innov.","volume":"13 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125823486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
N. S. Smirnova, M. Dumbser, M. Petrov, Alexander V. Chikitkin, E. Romenski
In this paper we propose a new flux splitting approach for the symmetric hyperbolic thermodynamically compatible (SHTC) equations of compressible two-phase flow which can be used in finite-volume methods. The approach is based on splitting the entire model into acoustic and pseudo-convective submodels. The associated acoustic system is solved numerically by applying an HLLC-type Riemann solver to its Lagrangian form. The convective part of the pseudo-convective submodel is solved by a standard upwind scheme. For the other parts of the pseudo-convective submodel we apply the FORCE method. A comparison is carried out with unsplit methods. Numerical results are obtained on several test problems and show good agreement with exact solutions and reference calculations.
{"title":"A Flux Splitting Method for the SHTC Model for High-performance Simulations of Two-phase Flows","authors":"N. S. Smirnova, M. Dumbser, M. Petrov, Alexander V. Chikitkin, E. Romenski","doi":"10.14529/JSFI180315","DOIUrl":"https://doi.org/10.14529/JSFI180315","url":null,"abstract":"In this paper we propose a new flux splitting approach for the symmetric hyperbolic thermodynamically compatible (SHTC) equations of compressible two-phase flow which can be used in finite-volume methods. The approach is based on splitting the entire model into acoustic and pseudo-convective submodels. The associated acoustic system is numerically solved applying HLLC-type Riemann solver for its Lagrangian form. The convective part of the pseudo-convective submodel is solved by a standart upwind scheme. For other parts of the pseudo-convective submodel we apply the FORCE method. A comparison is carried out with unsplit methods. Numerical results are obtained on several test problems. Results show good agreement with exact solutions and reference calculations.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. Innov.","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124019008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Setukha, V. A. Aparinov, A. Aparinov
In this article the authors present a parallel implementation of a numerical method for computer modeling of the dynamics of a parachute with a filled canopy. To solve the 3D problem of parachute free motion numerically, the authors formulate a coupled problem of dynamics and aerodynamics, where the aerodynamic characteristics are found with the discrete vortex method at each time integration step, and the corresponding equations of motion are solved to find the motion law. The solution of such problems requires substantial computational resources, because parachute motion has to be modeled over a long physical time period. Moreover, the behavior of the vortex wake behind the parachute is important and has to be modeled. In the authors' approach the wake is modeled as a set of flexible vortex elements. To increase computational efficiency, the authors use methods of low-rank matrix approximation as well as parallel implementations of the algorithms. A short description of the numerical method is presented, along with examples of numerical modeling.
"Supercomputer Modeling of Parachute Flight Dynamics." Supercomput. Front. Innov., 2018-09-01. DOI: 10.14529/jsfi180323.
P. Komarov, D. Guseva, V. Rudyak, A. Chertovich
Atomistic molecular dynamics simulations can usually cover only a very limited range in space and time. Thus, materials like polymer resin networks, whose properties are formed on the macroscopic scale, are hard to study thoroughly using molecular dynamics alone. Our work presents a multiscale simulation methodology to overcome this shortcoming. To demonstrate its effectiveness, we conducted a study of the thermal and mechanical properties of complex polymer matrices and established a direct correspondence between simulations and experimental results. We believe this methodology can be successfully used for predictive simulations of a broad range of polymer matrices in the glassy state.
"Multiscale Simulations Approach: Crosslinked Polymer Matrices." Supercomput. Front. Innov., 2018-09-01. DOI: 10.14529/JSFI180309.
K. Anisimov, A. Savelyev, I. A. Kursakov, A. Lysenkov, P. Prakasha
Nacelle shape optimization for a Blended Wing Body (BWB) aircraft is performed. The optimization procedure is based on numerical solutions of the Reynolds-averaged Navier-Stokes equations. The propulsion system was designed for the Top Level Aircraft Requirements formulated in the AGILE project. The optimization procedure was divided into two steps. In the first step, the isolated nacelle was designed and optimized for cruise regimes; this step is described in Section 3. In the second step, the nacelle positions over the airframe were optimized. To find the optimum solution, a surrogate-based Efficient Global Optimization algorithm is used. Automatic creation of a structured computational mesh is implemented so that the optimization algorithm can work effectively. The whole procedure is considered in the context of the third-generation multidisciplinary optimization techniques developed within the AGILE project. During the project, new techniques are to be implemented for the novel aircraft configurations chosen as test cases for the application of AGILE technologies. It is shown that the optimization technology meets all requirements and is suitable for use in the AGILE project.
{"title":"Optimization of BWB Aircraft Using Parallel Computing","authors":"K. Anisimov, A. Savelyev, I. A. Kursakov, A. Lysenkov, P. Prakasha","doi":"10.14529/JSFI180317","DOIUrl":"https://doi.org/10.14529/JSFI180317","url":null,"abstract":"Nacelle shape optimization for Blended Wing Body (BWB) is performed. The optimization procedure is based on numerical calculations of the Reynolds–averaged Navier–Stokes equations. For the Top Level Aircraft Requirements, formulated in AGILE project, the propulsion system was designed. The optimization procedure was divided in two steps. At first step, the isolated nacelle was designed and optimized for cruise regimes. This step is listed in paragraph 3. At second step the nacelles positions over airframe were optimized. To find the optimum solution, surrogate–based Efficient Global Optimization algorithm is used. An automatic structural computational mesh creation is realized for the effective optimization algorithm working. This whole procedure is considered in the context of the third generation multidisciplinary optimization techniques, developed within AGILE project. During the project, new techniques should be implemented for the novel aircraft configurations, chosen as test cases for application of AGILE technologies. It is shown that the optimization technology meets all requirements and is suitable for using in the AGILE project.","PeriodicalId":338883,"journal":{"name":"Supercomput. Front. Innov.","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126211148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Brodowicz, T. Sterling, Matthew Anderson
The end of Moore's Law is a cliche that is nonetheless a hard barrier to future scaling of high performance computing systems. A factor of about 4x in device density is all that is left of this form of improved throughput, while a 5x gain is required just to reach the milestone of exascale. The remaining sources of performance improvement are better delivered efficiency, of more than 10x, and alternative architectures that make better use of chip real estate. This paper discusses the set of principles guiding a potential future of non-von Neumann architectures as adopted by the experimental class of Continuum Computer Architecture (CCA), which is being explored by the Semantic Memory Architecture Research Team (SMART) at Indiana University. CCA comprises a homogeneous aggregation of cellular components (function cells) which are orders of magnitude smaller than lightweight cores and are individually unable to accomplish a computation, but in combination can do so with extreme cost efficiency and unprecedented scalability. It will be seen that a path exists, based on such unconventional methods as neuromorphic computing or dataflow, that not only will meet the likely exascale milestone in the same time frame with much better power, cost, and size, but also will set a new performance trajectory leading to Zettaflops capability before 2030.
"Continuum Computing - on a New Performance Trajectory beyond Exascale." Supercomput. Front. Innov., 2018-09-01. DOI: 10.14529/JSFI180301.
V. Lisitsa, V. Tcheverda, V. Volianskaia
We present an algorithm for numerical simulation of geological fault formation. The approach is based on the discrete element method, which allows modeling of the deformations and structural discontinuities of the upper part of the Earth's crust. In the discrete element method, the medium is represented as a combination of discrete particles which interact as elastic or viscoelastic bodies. Additionally, external potential forces, for example gravitational forces, may be introduced. At each time step the full set of forces acting on each particle is computed, after which the position of the particle is updated on the basis of Newtonian mechanics. We implement the algorithm using CUDA technology to simulate a single statistical realization of the model, whereas MPI is used to parallelize over different statistical realizations. The numerical results show that low dip angles of the tectonic displacements produce relatively narrow faults, whereas high dip angles of the tectonic displacements lead to wide V-shaped deformation zones.
"GPU-based Implementation of Discrete Element Method for Simulation of the Geological Fault Geometry and Position." Supercomput. Front. Innov., 2018-09-01. DOI: 10.14529/JSFI180307.