Breadth First Search on APEnet+
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.41 | pp. 248-253
M. Bernaschi, M. Bisson, Enrico Mastrostefano, D. Rossetti
We present preliminary results of a multi-GPU code for exploring large graphs (hundreds of millions of vertices and billions of edges) using the Breadth First Search algorithm. The GPU hosts are connected by APEnet+, a custom interconnection network with full support for NVIDIA GPUDirect peer-to-peer communication, i.e., the technology that allows a third-party device to directly access GPU memory over the PCI Express bus.
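As an illustration of the level-synchronous traversal such a multi-GPU BFS performs, the following Python sketch (not the authors' code) partitions vertices across "parts" that stand in for GPUs and exchanges newly discovered frontier vertices between owners at each level; the actual implementation runs on CUDA and moves these frontiers over APEnet+ via GPUDirect.

```python
# Minimal sketch: level-synchronous BFS over a vertex-partitioned graph,
# mimicking the per-level frontier exchange a multi-GPU BFS performs.
from collections import defaultdict

def partitioned_bfs(edges, num_vertices, num_parts, root):
    # Each "part" stands in for one GPU; vertex v is owned by part v % num_parts.
    owner = lambda v: v % num_parts
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    dist = [-1] * num_vertices
    dist[root] = 0
    frontiers = {p: set() for p in range(num_parts)}
    frontiers[owner(root)].add(root)
    level = 0
    while any(frontiers.values()):
        # Each part expands its local frontier and buckets discovered
        # vertices by owner (the data that would cross the network).
        outgoing = {p: defaultdict(set) for p in range(num_parts)}
        for p, frontier in frontiers.items():
            for u in frontier:
                for v in adj[u]:
                    outgoing[p][owner(v)].add(v)
        # "Exchange" phase: each owner merges incoming candidates, keeps unvisited ones.
        level += 1
        new_frontiers = {p: set() for p in range(num_parts)}
        for p in range(num_parts):
            for q in range(num_parts):
                for v in outgoing[q][p]:
                    if dist[v] == -1:
                        dist[v] = level
                        new_frontiers[p].add(v)
        frontiers = new_frontiers
    return dist

if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (0, 4), (4, 5)]
    print(partitioned_bfs(edges, 6, 2, root=0))   # -> [0, 1, 2, 3, 1, 2]
```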
{"title":"Breadth First Search on APEnet+","authors":"M. Bernaschi, M. Bisson, Enrico Mastrostefano, D. Rossetti","doi":"10.1109/SC.Companion.2012.41","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.41","url":null,"abstract":"We present preliminary results of a multi-GPU code for exploring large graphs (hundreds of millions vertices and billions of edges) by using the Breadth First Search algorithm. The GPU hosts are connected by APEnet+, a custom interconnection network that has full support for NVIDIA GPUDirect peer-topeer communication, i.e. the technology allowing a third party device to directly access the GPU memory over the PCI express bus.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"11 1","pages":"248-253"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73717034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimus: A Parallel Optimization Framework with Topology Aware PSO and Applications
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.303 | pp. 1524-1525
S. Sreepathi
This research presents a parallel metaheuristic optimization framework, Optimus (Optimization Methods for Universal Simulators), for integrating a desired population-based search method with a target scientific application. Optimus includes a parallel middleware component, PRIME (Parallel Reconfigurable Iterative Middleware Engine), for scalable deployment on emergent supercomputing architectures. Additionally, we designed TAPSO (Topology Aware Particle Swarm Optimization) for network-based optimization problems and applied it to achieve better convergence for water distribution system (WDS) applications. The framework supports concurrent optimization instances, for instance multiple swarms in the case of PSO. PRIME provides a lightweight communication layer to facilitate periodic inter-optimizer data exchanges. We performed a scalability analysis of Optimus on the Cray XK6 (Jaguar) at the Oak Ridge Leadership Computing Facility for the leak-detection problem in WDS. In a weak-scaling scenario, we achieved 84.82% of baseline performance at 200,000 cores relative to 1,000 cores, and 72.84% relative to the single-core case.
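For readers unfamiliar with the population-based search that Optimus orchestrates, the Python sketch below shows a plain global-best PSO loop; it is an illustration only, and neither TAPSO's topology awareness nor the PRIME middleware is reproduced here.

```python
# Illustrative global-best PSO, the kind of search method a framework like
# Optimus could drive (not the TAPSO variant described in the paper).
import random

def pso_minimize(f, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5):
    pos = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                       # personal bests
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]      # global best

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)          # simple test objective
    print(pso_minimize(sphere, dim=3))
```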
{"title":"Optimus: A Parallel Optimization Framework with Topology Aware PSO and Applications","authors":"S. Sreepathi","doi":"10.1109/SC.Companion.2012.303","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.303","url":null,"abstract":"This research presents a parallel metaheuristic optimization framework, Optimus (Optimization Methods for Universal Simulators) for integration of a desired population-based search method with a target scientific application. Optimus includes a parallel middleware component, PRIME (Parallel Reconfigurable Iterative Middleware Engine) for scalable deployment on emergent supercomputing architectures. Additionally, we designed TAPSO (Topology Aware Particle Swarm Optimization) for network based optimization problems and applied it to achieve better convergence for water distribution system (WDS) applications. The framework supports concurrent optimization instances, for instance multiple swarms in the case of PSO. PRIME provides a lightweight communication layer to facilitate periodic inter-optimizer data exchanges. We performed scalability analysis of Optimus on Cray XK6(Jaguar) at Oak Ridge Leadership Computing Facility for the leak detection problem in WDS. For a weak scaling scenario, we achieved 84.82% of baseline at 200,000 cores relative to performance at 1000 cores and 72.84% relative to one core scenario.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"6 1","pages":"1524-1525"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75051942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Poster: Evaluating Topology Mapping via Graph Partitioning
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.197 | p. 1372
Anshu Arya, T. Gamblin, B. Supinski, L. Kalé
Intelligently mapping applications to machine network topologies has been shown to improve performance, but considerable developer effort is required to find good mappings. Techniques from graph partitioning have the potential to automate topology mapping and relieve this developer burden. Graph partitioning is already used for load balancing parallel applications, but it can be applied to topology mapping as well. We show performance gains from using a topology-targeting graph partitioner to map sparse matrix-vector multiplication and volumetric 3-D FFT kernels onto a 3-D torus network.
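The Python sketch below illustrates the kind of objective a topology-targeting partitioner tries to minimize: the hop-bytes cost of a task-to-torus-coordinate mapping. It is a hedged illustration of the metric only, not the partitioner evaluated in the poster.

```python
# Hop-bytes cost of mapping communicating tasks onto a 3-D torus.
def torus_hops(a, b, dims):
    # Shortest per-dimension distance, accounting for wrap-around links.
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))

def hop_bytes(comm_graph, mapping, dims):
    # comm_graph: {(task_u, task_v): bytes}; mapping: task -> (x, y, z) coordinate.
    return sum(vol * torus_hops(mapping[u], mapping[v], dims)
               for (u, v), vol in comm_graph.items())

if __name__ == "__main__":
    dims = (4, 4, 4)
    comm = {(0, 1): 100, (1, 2): 100, (0, 2): 10}
    scattered = {0: (0, 0, 0), 1: (2, 0, 0), 2: (0, 3, 1)}
    compact = {0: (0, 0, 0), 1: (1, 0, 0), 2: (0, 1, 0)}
    # The compact placement moves far fewer hop-bytes than the scattered one.
    print(hop_bytes(comm, scattered, dims), hop_bytes(comm, compact, dims))
```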
{"title":"Poster: Evaluation Topology Mapping via Graph Partitioning","authors":"Anshu Arya, T. Gamblin, B. Supinski, L. Kalé","doi":"10.1109/SC.Companion.2012.197","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.197","url":null,"abstract":"Intelligently mapping applications to machine network topologies has been shown to improve performance, but considerable developer effort is required to find good mappings. Techniques from graph partitioning have the potential to automate topology mapping and relieve the developer burden. Graph partitioning is already used for load balancing parallel applications, but can be applied to topology mapping as well. We show performance gains by using a topology-targeting graph partitioner to map sparse matrix-vector and volumetric 3-D FFT kernels onto a 3-D torus network.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"35 1","pages":"1372-1372"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75088626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Poster: Planewave-Based First-Principles MD Calculation on 80,000-node K-Computer
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.281 | pp. 1491-1492
A. Kuroda, K. Minami, T. Yamasaki, J. Nara, J. Koga, T. Uda, T. Ohno
We show the efficiency of a first-principles electronic structure calculation code, PHASE, on the massively parallel supercomputer K, which has 80,000 nodes. The code is based on a plane-wave basis set and therefore relies on FFT routines. We parallelized the FFT routines needed in our code by localizing each FFT calculation within a small number of nodes, which reduces the communication time the FFTs require. We also introduce multi-axis parallelization over bands and plane waves, with which PHASE shows very high parallel efficiency. Using this code, we have investigated the structural stability of screw dislocations in silicon carbide, a topic that has attracted much attention because of its importance to the semiconductor industry.
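The sketch below (assuming mpi4py and NumPy; this is not the PHASE code) illustrates the localization idea: ranks are split into small sub-communicators, bands are assigned round-robin to the groups, and each 3-D FFT stays confined to its group rather than spanning all nodes. The intra-group distribution of each FFT grid is omitted.

```python
# Conceptual sketch of per-band FFT localization across small rank groups.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
group_size = 2                                    # ranks per localized FFT group
n_groups = max(comm.size // group_size, 1)
group = (comm.rank // group_size) % n_groups
fft_comm = comm.Split(color=group, key=comm.rank)  # small communicator per group

n_bands, grid = 8, (16, 16, 16)
my_bands = [b for b in range(n_bands) if b % n_groups == group]

for b in my_bands:
    # Stand-in for one band's real-space wavefunction; in a real code the grid
    # itself would additionally be distributed across fft_comm.
    psi = np.random.rand(*grid)
    psi_g = np.fft.fftn(psi)                      # FFT confined to this small group
    norm = np.vdot(psi_g, psi_g).real

fft_comm.Barrier()
if comm.rank == 0:
    print(n_groups, "groups, each FFT confined to at most", group_size, "ranks")
```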
{"title":"Poster: Planewave-Based First-Principles MD Calculation on 80,000-node K-Computer","authors":"A. Kuroda, K. Minami, T. Yamasaki, J. Nara, J. Koga, T. Uda, T. Ohno","doi":"10.1109/SC.Companion.2012.281","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.281","url":null,"abstract":"We show the efficiency of a first-principles electronic structure calculation code, PHASE on the massive-parallel super computer, K, which has 80,000 nodes. This code is based on plane-wave basis set, thus FFT routines are included. We succeeded in parallelization of FFT routines needed in our code by localizing each FFT calculation in small number of nodes, resulting in decreasing communication time required for FFT calculation. We also introduce multi-axis parallelization for bands and plane waves and then PHASE shows very high parallel efficiency. By using this code, we have investigated the structural stability of screw dislocations in silicon carbide, which has attracted much attention due to the semiconductor industry importance.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"616 1","pages":"1491-1492"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77650863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Quality-Aware Data Management for Large Scale Scientific Applications
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.114 | pp. 816-820
Hongbo Zou, F. Zheng, M. Wolf, G. Eisenhauer, K. Schwan, H. Abbasi, Qing Liu, N. Podhorszki, S. Klasky
Increasingly large-scale simulations are generating an unprecedented amount of output data, leading researchers to explore new `data staging' methods that buffer, use, and/or reduce such data online rather than simply pushing it to disk. Leveraging the capabilities of data staging, this study explores the potential for data reduction via online data compression, first using general compression techniques and then proposing use-specific methods that let users define simple data queries so that only the data identified by those queries is emitted. Using online methods for code generation and deployment with such dynamic data queries, end users can precisely control the quality of information (QoI) of their output data by explicitly determining what data may be lost versus retained, in contrast to general-purpose lossy compression methods that do not provide this level of control. The paper also describes the key elements of a quality-aware data management system (QADMS) for high-end machines enabled by this approach. Initial experimental results demonstrate that QADMS can effectively reduce data-movement cost and improve QoS while meeting the QoI constraint stated by users.
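A minimal sketch of the query-driven reduction idea, assuming NumPy (this is not the QADMS implementation): a user-supplied predicate selects which values leave the staging area, so the information loss is explicit rather than hidden inside a generic lossy compressor.

```python
# User-defined query applied to staged output before it is emitted.
import numpy as np

def reduce_by_query(field, query):
    """Keep only the entries the user's query selects; drop the rest."""
    mask = query(field)
    indices = np.flatnonzero(mask)
    return indices, field.flat[indices]          # compact representation to emit

if __name__ == "__main__":
    temperature = np.random.uniform(200.0, 2000.0, size=(64, 64))
    # Example query: the user only cares about hot-spot cells above 1500 K.
    idx, values = reduce_by_query(temperature, lambda f: f > 1500.0)
    print("kept", idx.size, "of", temperature.size, "values")
```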
{"title":"Quality-Aware Data Management for Large Scale Scientific Applications","authors":"Hongbo Zou, F. Zheng, M. Wolf, G. Eisenhauer, K. Schwan, H. Abbasi, Qing Liu, N. Podhorszki, S. Klasky","doi":"10.1109/SC.Companion.2012.114","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.114","url":null,"abstract":"Increasingly larger scale simulations are generating an unprecedented amount of output data, causing researchers to explore new `data staging' methods that buffer, use, and/or reduce such data online rather than simply pushing it to disk. Leveraging the capabilities of data staging, this study explores the potential for data reduction via online data compression, first using general compression techniques and then proposing use-specific methods that permit users to define simple data queries that cause only the data identified by those queries to be emitted. Using online methods for code generation and deployment, with such dynamic data queries, end users can precisely identify the quality of information (QoI) of their output data, by explicitly determining what data may be lost vs. retained, in contrast to general-purpose lossy compression methods that do not provide such levels of control. The paper also describes the key elements of a quality-aware data management system (QADMS) for high-end machines enabled by this approach. Initial experimental results demonstrate that QADMS can effectively reduce data movement cost and improve the QoS while meeting the QoI constraint stated by users.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"27 1","pages":"816-820"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81897312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluating Workflow Tools with SDAG
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.20 | pp. 54-63
Muhammad Ali Amer, Robert Lucas
Workflow management systems (WMS) are typically composed of, or make use of, multiple independent software components. The design and development of those components is typically driven by the functional requirements of the scientific applications that use the corresponding WMS. Consequently, the WMS design reflects those core functional requirements in the applications it supports in the future. Whereas most design criteria are engineered to be as generic as possible, some design trade-offs may prove sub-optimal for certain new workflow applications. We argue that WMS design trade-offs that emerge from a limited set of real-world applications can be minimized by using larger, more varied synthetic application datasets. We present SDAG, a tool for generating synthetic well-formed workflows (WFWs) that span a varied space of synthetic WFWs around any reference workflow. These synthetic WFWs enable developers to test and evaluate WMS or their constituent software components on a broad range of workflows, enabling more generic design criteria for WMS.
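The Python sketch below shows one simple way to generate a well-formed (acyclic) synthetic workflow by layering tasks and only drawing edges from earlier to later layers; it illustrates the concept only and is not SDAG's generator.

```python
# Generate a random layered DAG as a toy "well-formed workflow".
import random

def synthetic_workflow(n_layers=4, width=3, edge_prob=0.5, seed=0):
    rng = random.Random(seed)
    layers = [[f"t{l}_{i}" for i in range(rng.randint(1, width))]
              for l in range(n_layers)]
    edges = []
    for l in range(1, n_layers):
        for task in layers[l]:
            parents = [p for p in layers[l - 1] if rng.random() < edge_prob]
            # Ensure every task has at least one parent so the DAG stays connected.
            edges += [(p, task) for p in (parents or [rng.choice(layers[l - 1])])]
    return [t for layer in layers for t in layer], edges

if __name__ == "__main__":
    tasks, edges = synthetic_workflow()
    print(len(tasks), "tasks,", len(edges), "edges")
```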
{"title":"Evaluating Workflow Tools with SDAG","authors":"Muhammad Ali Amer, Robert Lucas","doi":"10.1109/SC.Companion.2012.20","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.20","url":null,"abstract":"Workflow management systems (WMS) are typically comprised of or make use of multiple independent software components. The design and development of those components is typically drawn from functional requirements of scientific applications that utilize the corresponding WMS. Consequently, the WMS design reflects those core functional requirements in applications that it supports in the future. Whereas most design criteria are engineered to be as generic as possible, some design trade-offs may prove sub-optimal for certain new workflow applications. We argue that WMS design tradeoffs that emerge from a limited set of real-world applications can be minimized by the use of larger, more varied synthetic application datasets. We present SDAG, a tool for generating synthetic well formed workflows (WFWs) that span a varied space of synthetic WFWs around any reference workflow. These synthetic WFWs enable developers to test and evaluate WMS or their constituent software components on a broad range of workflows and enable more generic design criteria for WMS.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"19 1","pages":"54-63"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86006048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The K computer - Toward its productive applications to our life
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.344 | pp. 1673-1701
M. Yokokawa
This article consists of a collection of slides from the author's conference presentation. The author concludes that HPC technology is essential for sustainable human life in the future and that HPC activities must be promoted further. With the K computer's powerful and stable computing capability, we expect useful results in science and engineering and breakthroughs in research and development. We should pursue more realistic simulations with future systems. Japan will continue to contribute to the HPC community.
{"title":"The K computer - Toward its productive applications to our life","authors":"M. Yokokawa","doi":"10.1109/SC.Companion.2012.344","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.344","url":null,"abstract":"This article consists of a collection of slides from the author's conference presentation. The author concludes that HPC technology is essential for sustainable human life in the future and we have to promote HPC activities more and more. By K's powerful and stable computing capability, we expect useful results in science and engineering and break-through in research and development. We should pursue more realistic simulations by the future system. Japan will continue to contribute to HPC community.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"28 1","pages":"1673-1701"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86042863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
HOG: Distributed Hadoop MapReduce on the Grid
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.154 | pp. 1276-1283
Chen He, D. Weitzel, D. Swanson, Ying Lu
MapReduce is a powerful data processing platform for commercial and academic applications. In this paper, we build a novel Hadoop MapReduce framework, Hadoop On the Grid (HOG), that executes on the Open Science Grid, which spans multiple institutions across the United States. It differs from previous MapReduce platforms that run in dedicated environments such as clusters or clouds. HOG provides a free, elastic, and dynamic MapReduce environment on the opportunistic resources of the grid. In HOG, we improve Hadoop's fault tolerance for wide-area data analysis by mapping data centers across the U.S. to virtual racks and creating multi-institution failure domains. Our modifications to the Hadoop framework are transparent to existing Hadoop MapReduce applications. In the evaluation, we successfully scale HOG to 1,100 nodes on the grid. Additionally, we evaluate HOG with a simulated Facebook Hadoop MapReduce workload. We conclude that HOG's rapid scalability can provide performance comparable to a dedicated Hadoop cluster.
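The virtual-rack idea can be pictured with Hadoop's standard rack-awareness hook: a topology script that Hadoop calls with worker host names and that prints one rack path per host. The sketch below is hypothetical (the site-to-rack table is invented for illustration) and is not HOG's actual script.

```python
#!/usr/bin/env python
# Toy Hadoop rack-awareness topology script: map each worker's hostname to a
# "virtual rack" named after its grid site, so HDFS spreads replicas across
# institutions. Hadoop invokes the configured topology script with host names
# or IPs as arguments and reads one rack path per host from stdout.
import sys

# Hypothetical site suffixes; a real deployment would derive these from the grid.
SITE_OF = {
    "unl.edu": "/rack-nebraska",
    "ucsd.edu": "/rack-ucsd",
    "fnal.gov": "/rack-fermilab",
}

def rack_for(host):
    for suffix, rack in SITE_OF.items():
        if host.endswith(suffix):
            return rack
    return "/rack-default"

if __name__ == "__main__":
    for host in sys.argv[1:]:
        print(rack_for(host))
```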
{"title":"HOG: Distributed Hadoop MapReduce on the Grid","authors":"Chen He, D. Weitzel, D. Swanson, Ying Lu","doi":"10.1109/SC.Companion.2012.154","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.154","url":null,"abstract":"MapReduce is a powerful data processing platform for commercial and academic applications. In this paper, we build a novel Hadoop MapReduce framework executed on the Open Science Grid which spans multiple institutions across the United States - Hadoop On the Grid (HOG). It is different from previous MapReduce platforms that run on dedicated environments like clusters or clouds. HOG provides a free, elastic, and dynamic MapReduce environment on the opportunistic resources of the grid. In HOG, we improve Hadoop's fault tolerance for wide area data analysis by mapping data centers across the U.S. to virtual racks and creating multi-institution failure domains. Our modifications to the Hadoop framework are transparent to existing Hadoop MapReduce applications. In the evaluation, we successfully extend HOG to 1100 nodes on the grid. Additionally, we evaluate HOG with a simulated Facebook Hadoop MapReduce workload. We conclude that HOG's rapid scalability can provide comparable performance to a dedicated Hadoop cluster.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"33 1","pages":"1276-1283"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84028458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tight Coupling of R and Distributed Linear Algebra for High-Level Programming with Big Data
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.113 | pp. 811-815
Drew Schmidt, G. Ostrouchov, Wei-Chen Chen, Pragneshkumar B. Patel
We present a new distributed programming extension of the R programming language. By tightly coupling R to the well-known ScaLAPACK and MPI libraries, we achieve highly scalable implementations of common statistical methods, allowing users to analyze bigger datasets with R than ever before. Early benchmarks are very promising for the project and its future.
{"title":"Tight Coupling of R and Distributed Linear Algebra for High-Level Programming with Big Data","authors":"Drew Schmidt, G. Ostrouchov, Wei-Chen Chen, Pragneshkumar B. Patel","doi":"10.1109/SC.Companion.2012.113","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.113","url":null,"abstract":"We present a new distributed programming extension of the R programming language. By tightly coupling R to the well-known ScaLAPACK and MPI libraries, we are able to achieve highly scalable implementations of common statistical methods, allowing the user to analyze bigger datasets with R than ever before. Early benchmarks show great optimism for the project and its future.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"14 1","pages":"811-815"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77025442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}