Pub Date: 2012-11-10. DOI: 10.1109/SC.Companion.2012.138
M. Creel, M. Zubair
In this paper, we describe a GPU-based implementation of an estimator based on an indirect likelihood inference method. This method relies on simulations from a model and on nonparametric density or regression function computations. The estimation application arises in various domains, such as econometrics and finance, when the model is fully specified but too complex for estimation by maximum likelihood. We implemented the estimator on a machine with two 2.67GHz Intel Xeon X5650 processors and four NVIDIA M2090 GPU devices. We optimized the GPU code through efficient use of the shared memory and registers available on the GPU devices. We compared the optimized GPU code's performance with a C-based sequential version of the code executed on the host machine. We observed a speedup factor of up to 242 with four GPU devices.
{"title":"High Performance Implementation of an Econometrics and Financial Application on GPUs","authors":"M. Creel, M. Zubair","doi":"10.1109/SC.Companion.2012.138","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.138","url":null,"abstract":"In this paper, we describe a GPU based implementation for an estimator based on an indirect likelihood inference method. This method relies on simulations from a model and on nonparametric density or regression function computations. The estimation application arises in various domains such as econometrics and finance, when the model is fully specified, but too complex for estimation by maximum likelihood. We implemented the estimator on a machine with two 2.67GHz Intel Xeon X5650 processors and four NVIDIA M2090 GPU devices. We optimized the GPU code by efficient use of shared memory and registers available on the GPU devices. We compared the optimized GPU code performance with a C based sequential version of the code that was executed on the host machine. We observed a speed up factor of up to 242 with four GPU devices.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"os-27 1","pages":"1147-1153"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87212408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
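The abstract's core recipe, simulate from a fully specified model and recover parameters via a nonparametric regression, can be sketched in a few lines. The following is a hedged illustration of the indirect-inference idea, not the paper's implementation: the model (a normal with unknown mean and scale), the uniform priors, the summary statistics, and the kernel bandwidth are all hypothetical choices.

```python
import numpy as np

# Hedged sketch: simulate the model at many candidate parameter draws,
# summarize each simulated data set with statistics, then estimate the
# parameters behind the observed data by a Nadaraya-Watson kernel
# regression of parameters on simulated statistics.
rng = np.random.default_rng(0)
n_sim, n_obs = 10_000, 500

# Candidate parameters theta = (mu, sigma) drawn from uniform priors.
thetas = np.column_stack([rng.uniform(-2, 2, n_sim),
                          rng.uniform(0.5, 2, n_sim)])

# Simulate one data set per candidate and reduce it to summary statistics.
samples = rng.normal(thetas[:, :1], thetas[:, 1:], size=(n_sim, n_obs))
Zs = np.column_stack([samples.mean(axis=1), samples.std(axis=1)])

# "Observed" data generated at the (pretend) true parameters (1.0, 1.0).
obs = rng.normal(1.0, 1.0, size=n_obs)
Z_obs = np.array([obs.mean(), obs.std()])

# Kernel regression of theta on Z, evaluated at the observed statistics.
bandwidth = 0.5
w = np.exp(-0.5 * np.sum((Zs - Z_obs) ** 2, axis=1) / bandwidth**2)
theta_hat = (w[:, None] * thetas).sum(axis=0) / w.sum()
```

The simulation loop over candidate parameters is embarrassingly parallel, which is what makes this class of estimator such a natural fit for multiple GPUs.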
Pub Date: 2012-11-10. DOI: 10.1109/SC.Companion.2012.145
J. M. Reddy, J. Monika
Cloud computing is known as a novel information technology (IT) concept, which involves facilitated and rapid access to networks, servers, data storage media, applications, and services via the Internet with minimum hardware requirements. The use of information systems and technologies on the battlefield is not new. Information superiority is a force multiplier and is crucial to mission success. Distributed cloud computing systems are operational in the military today, and extensive use of military clouds on the battlefield is predicted for the near future. Integrating cloud computing logic into military applications will increase flexibility, cost-effectiveness, efficiency, and accessibility. In this paper, distributed cloud computing concepts are defined and cloud-computing-supported battlefield applications are analyzed. The effects of cloud computing systems on the information domain in future warfare are discussed, and the battlefield opportunities and novelties that distributed cloud computing systems might introduce are researched. The role of military clouds in future warfare is proposed in this paper. It is concluded that military clouds will be indispensable components of the future battlefield: they have the potential to increase situational awareness on the battlefield and to facilitate the establishment of information superiority.
{"title":"Integrate Military with Distributed Cloud Computing and Secure Virtualization","authors":"J. M. Reddy, J. Monika","doi":"10.1109/SC.Companion.2012.145","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.145","url":null,"abstract":"Cloud computing is known as a novel information technology (IT) concept, which involves facilitated and rapid access to networks, servers, data saving media, applications and services via Internet with minimum hardware requirements. Use of information systems and technologies at the battlefield is not new. Information superiority is a force multiplier and is crucial to mission success. Distributed cloud computing in the Military systems are operational today. In the near future extensive use of military clouds at the battlefield is predicted. Integrating cloud computing logic to military applications will increase the flexibility, cost-effectiveness, efficiency and accessibility capabilities. In this paper, distributed cloud computing concepts are defined. Cloud computing supported battlefield applications are analyzed. The effects of cloud computing systems on the information domain in future warfare are discussed. Battlefield opportunities and novelties which might be introduced by distributed cloud computing systems are researched. The role of military clouds in future warfare is proposed in this paper. It was concluded that military clouds will be indispensible components of the future battlefield. Military clouds have the potential of increasing situational awareness at the battlefield and facilitating the settlement of information superiority.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"86 1","pages":"1200-1206"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82205997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2012-11-10. DOI: 10.1109/SC.Companion.2012.57
Hormozd Gahvari, W. Gropp, K. E. Jordan, M. Schulz, U. Yang
The IBM Blue Gene/Q represents a large step in the evolution of massively parallel machines. It features 16-core compute nodes, with additional parallelism in the form of four simultaneous hardware threads per core, connected together by a five-dimensional torus network. Machines are being built with core counts in the hundreds of thousands, with the largest, Sequoia, featuring over 1.5 million cores. In this paper, we develop a performance model for the solve cycle of algebraic multigrid on Blue Gene/Q to help us understand the issues this popular linear solver for large, sparse linear systems faces on this architecture. We validate the model on a Blue Gene/Q at IBM, and conclude with a discussion of the implications of our results.
{"title":"Performance Modeling of Algebraic Multigrid on Blue Gene/Q: Lessons Learned","authors":"Hormozd Gahvari, W. Gropp, K. E. Jordan, M. Schulz, U. Yang","doi":"10.1109/SC.Companion.2012.57","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.57","url":null,"abstract":"The IBM Blue Gene/Q represents a large step in the evolution of massively parallel machines. It features 16-core compute nodes, with additional parallelism in the form of four simultaneous hardware threads per core, connected together by a five-dimensional torus network. Machines are being built with core counts in the hundreds of thousands, with the largest, Sequoia, featuring over 1.5 million cores. In this paper, we develop a performance model for the solve cycle of algebraic multigrid on Blue Gene/Q to help us understand the issues this popular linear solver for large, sparse linear systems faces on this architecture. We validate the model on a Blue Gene/Q at IBM, and conclude with a discussion of the implications of our results.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"39 3 1","pages":"377-385"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79906465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
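Performance models of this kind typically start from an alpha-beta (latency-bandwidth) cost for each communication phase of the multigrid V-cycle. As a hedged illustration of that modeling style, not the paper's actual model or measured Blue Gene/Q parameters, consider:

```python
# Alpha-beta (latency-bandwidth) communication cost model, the standard
# starting point for models of AMG solve-cycle performance. All numbers
# below are illustrative placeholders, not measured Blue Gene/Q values.
def level_comm_cost(alpha, beta, n_msgs, msg_bytes):
    # One communication phase: per-message latency plus bytes over bandwidth.
    return n_msgs * (alpha + beta * msg_bytes)

def v_cycle_comm_cost(alpha, beta, levels):
    # levels: (messages per process, bytes per message), fine grid to coarse.
    return sum(level_comm_cost(alpha, beta, m, b) for m, b in levels)

# Coarser AMG levels typically send more, smaller messages per process,
# which makes the solve cycle increasingly latency-bound as it descends.
hierarchy = [(6, 8192), (10, 2048), (14, 512), (20, 128)]
cost = v_cycle_comm_cost(alpha=2e-6, beta=1e-9, levels=hierarchy)  # seconds
```

A model along these lines makes the paper's central tension concrete: shrinking messages on coarse grids leave the latency term dominant, so network latency, not bandwidth, tends to limit AMG at scale.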
Pub Date: 2012-11-10. DOI: 10.1109/SC.Companion.2012.83
K. W. Smith, W. Spotz, S. Ross-Ross
We present three Python software projects: PyTrilinos, for calling Trilinos distributed-memory HPC solvers from Python; Optimized Distributed NumPy (ODIN), for distributed array computing; and Seamless, for automatic, just-in-time compilation of Python source code. We argue that these three projects in combination provide a framework for high-performance computing in Python. They provide this framework by supplying necessary features (in the case of ODIN and Seamless) and algorithms (in the case of ODIN and PyTrilinos) for a user to develop HPC applications. Together they address the principal limitations (real or imagined) ascribed to Python when applied to high-performance computing. A high-level overview of each project is given, including brief explanations of how these projects work in conjunction, to the benefit of end users.
{"title":"A Python HPC Framework: PyTrilinos, ODIN, and Seamless","authors":"K. W. Smith, W. Spotz, S. Ross-Ross","doi":"10.1109/SC.Companion.2012.83","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.83","url":null,"abstract":"We present three Python software projects: PyTrilinos, for calling Trilinos distributed memory HPC solvers from Python; Optimized Distributed NumPy (ODIN), for distributed array computing; and Seamless, for automatic, Just-in-time compilation of Python source code. We argue that these three projects in combination provide a framework for high-performance computing in Python. They provide this framework by supplying necessary features (in the case of ODIN and Seamless) and algorithms (in the case of ODIN and PyTrilinos) for a user to develop HPC applications. Together they address the principal limitations (real or imagined) ascribed to Python when applied to high-performance computing. A high-level overview of each project is given, including brief explanations as to how these projects work in conjunction to the benefit of end users.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"34 1","pages":"593-599"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89436812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2012-11-10. DOI: 10.1109/SC.Companion.2012.287
Hang Liu, J. Seo, R. Mittal
Finding a fast solver for the Poisson equation is important for many scientific applications. In this work, we design and develop a matrix decomposition based Conjugate Gradient (CG) solver, which leverages Graphics Processing Unit (GPU) clusters to accelerate the calculation of the Poisson equation. Our experiments show that the new CG solver is highly scalable and achieves significant speedup over a CPU-based Multi-Grid (MG) solver.
{"title":"Poster: Matrix Decomposition Based Conjugate Gradient Solver for Poisson Equation","authors":"Hang Liu, J. Seo, R. Mittal","doi":"10.1109/SC.Companion.2012.287","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.287","url":null,"abstract":"Finding a fast solver for the Poisson equation is important for many scientific applications. In this work, we design and develop a matrix decomposition based Conjugate Gradient (CG) solver, which leverages Graphics Processing Unit (GPU) clusters to accelerate the calculation of the Poisson equation. Our experiments show that the new CG solver is highly scalable and achieves significant speedup over a CPU-based Multi-Grid (MG) solver.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"18 1","pages":"1501-1501"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89500732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
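The Conjugate Gradient method the poster builds on is compact enough to sketch directly. The following is a minimal serial, matrix-free CG for a 1D Poisson problem, a textbook baseline rather than the authors' matrix-decomposition GPU-cluster solver; the grid size and right-hand side are illustrative.

```python
import numpy as np

def poisson_matvec(u, h):
    # Matrix-free 1D Poisson operator (-u'') with homogeneous Dirichlet
    # boundaries on a uniform grid of spacing h.
    Au = 2.0 * u.copy()
    Au[1:] -= u[:-1]
    Au[:-1] -= u[1:]
    return Au / h**2

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    # Standard (unpreconditioned) CG for a symmetric positive-definite operator.
    x = np.zeros_like(b)
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Manufactured solution u = sin(pi x), so f = -u'' = pi^2 sin(pi x).
n = 127
h = 1.0 / (n + 1)
x_grid = np.linspace(h, 1 - h, n)
b = np.pi**2 * np.sin(np.pi * x_grid)
u = conjugate_gradient(lambda v: poisson_matvec(v, h), b)
```

Each CG iteration is one sparse matvec plus a few dot products and vector updates, exactly the operations that map well onto GPUs, which is why accelerating this kernel is worthwhile.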
Pub Date: 2012-11-10. DOI: 10.1109/SC.Companion.2012.29
A. Chervenak, David E. Smith, Weiwei Chen, E. Deelman
As scientific applications generate and consume data at ever-increasing rates, scientific workflow systems that manage the growing complexity of analyses and data movement will increase in importance. The goal of our work is to improve the overall performance of scientific workflows by using policy to improve data staging into and out of computational resources. We developed a Policy Service that gives advice to the workflow system about how to stage data, including advice on the order of data transfers and on transfer parameters. The Policy Service gives this advice based on its knowledge of ongoing transfers, recent transfer performance, and the current allocation of resources for data staging. The paper describes the architecture of the Policy Service and its integration with the Pegasus Workflow Management System. It employs a range of policies for data staging, and presents performance results for one policy that does a greedy allocation of data transfer streams between source and destination sites. The results show performance improvements for a data-intensive workflow: the Montage astronomy workflow augmented to perform additional large data staging operations.
{"title":"Integrating Policy with Scientific Workflow Management for Data-Intensive Applications","authors":"A. Chervenak, David E. Smith, Weiwei Chen, E. Deelman","doi":"10.1109/SC.Companion.2012.29","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.29","url":null,"abstract":"As scientific applications generate and consume data at ever-increasing rates, scientific workflow systems that manage the growing complexity of analyses and data movement will increase in importance. The goal of our work is to improve the overall performance of scientific workflows by using policy to improve data staging into and out of computational resources. We developed a Policy Service that gives advice to the workflow system about how to stage data, including advice on the order of data transfers and on transfer parameters. The Policy Service gives this advice based on its knowledge of ongoing transfers, recent transfer performance, and the current allocation of resources for data staging. The paper describes the architecture of the Policy Service and its integration with the Pegasus Workflow Management System. It employs a range of policies for data staging, and presents performance results for one policy that does a greedy allocation of data transfer streams between source and destination sites. The results show performance improvements for a data-intensive workflow: the Montage astronomy workflow augmented to perform additional large data staging operations.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"28 1","pages":"140-149"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90303692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
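The greedy stream-allocation policy the abstract evaluates can be sketched generically. The following is a hypothetical illustration of greedy allocation under a diminishing-returns assumption, not the Policy Service's actual algorithm or throughput model; the site names, rates, and the `diminish` factor are all invented for the example.

```python
import heapq

def greedy_stream_allocation(throughput_per_stream, total_streams, diminish=0.8):
    # Hand out transfer streams one at a time to the source-destination pair
    # with the highest expected marginal throughput gain. Each extra stream
    # on a pair is assumed to add `diminish` times the previous stream's gain
    # (a hypothetical diminishing-returns model).
    alloc = {pair: 0 for pair in throughput_per_stream}
    # Max-heap via negated gains.
    heap = [(-gain, pair) for pair, gain in throughput_per_stream.items()]
    heapq.heapify(heap)
    for _ in range(total_streams):
        neg_gain, pair = heapq.heappop(heap)
        alloc[pair] += 1
        heapq.heappush(heap, (neg_gain * diminish, pair))
    return alloc

# First-stream throughput (MB/s) observed for each source-destination pair.
pairs = {("siteA", "dest"): 100.0, ("siteB", "dest"): 60.0}
alloc = greedy_stream_allocation(pairs, total_streams=5)
```

The heap makes each allocation decision O(log n) in the number of site pairs, and the greedy order naturally favors the historically faster link until its marginal gain drops below the slower one's.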
Pub Date: 2012-11-10. DOI: 10.1109/SC.Companion.2012.124
Nicolas Dubé
This presentation debunks three "truths" as seen from Plato's cave: the untold story of PUE, the promise of clean coal, and the claim that water is free and available.
{"title":"Philosophy 301: But Can You \"Handle the Truth\"?","authors":"Nicolas Dubé","doi":"10.1109/SC.Companion.2012.124","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.124","url":null,"abstract":"This presentation debunks three \"truths\" as seen from Plato's cave: the untold story of PUE, clean coal, and water is free and available.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"19 1","pages":"993-1017"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90699826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2012-11-10. DOI: 10.1109/SC.Companion.2012.358
B. Walkup
This article consists of a collection of slides from the author's conference presentation. The author concludes that the Blue Gene/Q design, with low-power simple cores and four hardware threads per core, results in high instruction throughput and thus exceptional power efficiency for applications: the hardware threads can effectively fill pipeline stalls and hide latencies in the memory subsystem. The consequence is low performance per thread, so a high degree of parallelization is required for high application performance. Traditional programming methods (MPI, OpenMP, Pthreads) hold up at very large scales. Memory costs can limit scaling when data structures grow linearly with the number of processes; threading helps by keeping the number of processes manageable. Detailed performance analysis is viable at more than 10^6 processes but requires care, and on-the-fly performance data reduction has merits.
{"title":"Application performance characterization and analysis on Blue Gene/Q","authors":"B. Walkup","doi":"10.1109/SC.Companion.2012.358","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.358","url":null,"abstract":"This article consists of a collection of slides from the author's conference presentation. The author concludes that The Blue Gene/Q design, low-power simple cores, four hardware threads per core, resu lts in high instruction throughput, and thus exceptional power efficiency for applications. Can effectively fill in pipeline stalls and hide latencies in the memory subsystem. The consequence is low performance per thread, so a high degree of parallelization is required for high application performance. Traditional programming methods (MPI, OpenMP, Pthreads) hold up at very large scales. Memory costs can limit scaling when there are data-structures with size linear in the number of processes, threading helps by keeping the number of processes manageable. Detailed performance analysis is viable at > 10^6 processes but requires care. On-the-fly performance data reduction has merits.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"77 1","pages":"2247-2280"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80791774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2012-11-10. DOI: 10.1109/SC.Companion.2012.364
Bradley Carvey, Nathan Fabian, D. Rogers
The animation shows a simulation of an explosive charge blowing a hole in a steel plate. The simulation data was generated on Sandia National Laboratories' Red Sky supercomputer. ParaView was used to export polygonal data, which was then textured and rendered using a commercial 3D rendering package.
{"title":"Explosive Charge Blowing a Hole in a Steel Plate Animation","authors":"Bradley Carvey, Nathan Fabian, D. Rogers","doi":"10.1109/SC.Companion.2012.364","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.364","url":null,"abstract":"The animation shows a simulation of an explosive charge, blowing a hold in a steel plate. The simulation data was generated on Sandia National Lab's Red Sky Supercomputer. ParaView was used to export polygonal data, which was then textured and rendered using a commercial 3d rendering package.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"101 1","pages":"1576-1577"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80416795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2012-11-10. DOI: 10.1109/SC.Companion.2012.65
T. Janjusic, K. Kavi, Christos Kartsaklis
As the complexity of scientific codes and computational hardware increases, it is increasingly important to study the effects of data-structure layouts on program memory behavior. Different structure layouts affect memory performance differently; we therefore need the capability to study such transformations effectively without rewriting application codes. Trace-driven simulations are an effective and convenient mechanism for simulating program behavior at various granularities. During an application's execution, a tool known as a tracer or profiler collects program flow data and records program instructions. The trace file consists of tuples that associate each program instruction with the program's internal variables. In this paper we outline a proof-of-concept mechanism to apply data-structure transformations during trace simulation and to observe their effects on memory without manually transforming an application's code.
{"title":"Trace Driven Data Structure Transformations","authors":"T. Janjusic, K. Kavi, Christos Kartsaklis","doi":"10.1109/SC.Companion.2012.65","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.65","url":null,"abstract":"As the complexity of scientific codes and computational hardware increases it is increasingly important to study the effects of data-structure layouts on program memory behavior. Program structure layouts affect the memory performance differently, therefore we need the capability to effectively study such transformations without the need to rewrite application codes. Trace-driven simulations are an effective and convenient mechanism to simulate program behavior at various granularities. During an application's execution, a tool known as a tracer or profiler, collects program flow data and records program instructions. The trace-file consists of tuples that associate each program instruction with program internal variables. In this paper we outline a proof-of-concept mechanism to apply data-structure transformations during trace simulation and observe effects on memory without the need to manually transform an application's code.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"146 1","pages":"456-464"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76443786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
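The paper's tool applies layout transformations at trace-simulation time; as a standalone illustration of the kind of transformation involved, here is a minimal array-of-structures (AoS) versus structure-of-arrays (SoA) sketch using NumPy structured arrays. The field names and sizes are hypothetical, and this shows only the layout change itself, not the trace-driven mechanism.

```python
import numpy as np

n = 100_000

# Array-of-structures layout: the fields of each record are interleaved
# in memory, so a loop that reads only "x" still pulls the other fields
# through the cache.
aos = np.zeros(n, dtype=[("x", "f8"), ("y", "f8"), ("z", "f8"), ("mass", "f8")])
aos["x"] = np.arange(n, dtype="f8")

# Structure-of-arrays layout: each field becomes its own contiguous array,
# so a kernel touching only "x" gets fully sequential, cache-friendly access.
soa = {name: np.ascontiguousarray(aos[name]) for name in aos.dtype.names}

# The transformation preserves semantics: any field-wise computation gives
# the same result in either layout; only the memory access pattern changes.
total_aos = aos["x"].sum()
total_soa = soa["x"].sum()
```

A trace-driven simulator can model exactly this kind of rewrite by remapping each recorded address to its position under the alternative layout, which is what lets layout effects be studied without touching the source code.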