Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.)
Latest Publications
M. Shephard, Cameron W. Smith. "HPC Simulation Workflows for Engineering Innovation." Pages 56:1-56:2. DOI: https://doi.org/10.1145/2616498.2616556. Published 2014-07-13.
Abstract: Efforts to develop component-based simulation workflows for industrial applications using XSEDE parallel computing systems are presented.
Troy Baer, Douglas Johnson. "pbsacct: A Workload Analysis System for PBS-Based HPC Systems." Pages 42:1-42:6. DOI: https://doi.org/10.1145/2616498.2616539. Published 2014-07-13.
Abstract: The PBS family of resource management systems has historically not included workload analysis tools, and currently available third-party workload analysis packages often have no way to identify the applications being run through the batch environment. This paper introduces the pbsacct system, which solves the application identification problem by storing job scripts alongside accounting information and allowing the development of site-specific heuristics to map job-script patterns to applications. The system consists of a database, data ingestion tools, and command-line and web-based user interfaces. The paper discusses the pbsacct system and deployments at two sites, the National Institute for Computational Sciences and the Ohio Supercomputer Center. Workload analyses for systems at each site are also discussed.
Y. Zhuang, M. Ceotto, W. Hase. "Towards Efficient Direct Semiclassical Molecular Dynamics for Complex Molecular Systems." Page 26:1. DOI: https://doi.org/10.1145/2616498.2616519. Published 2014-07-13.
Abstract: Chemical processes are intrinsically quantum mechanical, and quantum effects cannot be excluded a priori. Classical dynamics using fitted force fields has been routinely applied to complex molecular systems. But because the force fields used in classical dynamics are tuned to fit experimental and/or electronic structure data, the harmonic potential approximation and the neglect of quantum effects are compensated for artificially and in an ad hoc manner. Fitting atomic forces is also a trade-off between the desired accuracy and the human and computational effort required to construct them, and it is often biased by the functional forms chosen. The resulting force field may therefore not be transferable, i.e., it cannot be applied a priori to other molecular systems. In addition, force fields do not account for bond dissociation or excited vibrational processes, due to the harmonic approximation.

To bypass these force field limitations, an alternative is the direct dynamics (on-the-fly) approach, in which classical nuclear dynamics is coupled with atomic forces calculated from quantum mechanical electronic structure theory. Direct semiclassical molecular dynamics employs thousands of direct dynamics trajectories to calculate the Feynman path integral propagator and reproduces quantitative quantum effects with errors often smaller than 1%, making it a very promising tool for including quantum effects in complex molecular systems. Direct semiclassical dynamics incurs much lower computation cost than purely quantum dynamics, but its cost must still be reduced substantially before it can be applied to complex and interesting molecular systems on large HPC machines.

The high computation cost of direct semiclassical dynamics comes from two sources: the large number of trajectories needed, and the enormous cost of calculating a single trajectory. In this talk, we present our efforts to contain the computation costs from both sources in order to make direct semiclassical dynamics feasible on modern HPC systems.

A single trajectory of a direct semiclassical dynamics simulation may take days to weeks on a powerful multi-core processor. For instance, our ongoing study of 10-atom glycine with the B3LYP/6-31G** electronic structure theory takes about 11.5 days on two quad-core Intel Xeon 2.26 GHz processors (8 cores total) for a trajectory of 5000 time steps. To reduce the single-trajectory calculation time, we developed a mathematical method that reuses directional data buried in previously calculated quantum data for future time steps, thereby reducing the number of expensive quantum mechanical electronic structure calculations. With the new method, we are able to reduce the computation time of a 5000-step trajectory to about 2 days with almost the same accuracy.

A simulation study for glycine requires hundreds of thousands to even millions of trajectories when a usual semiclassical method is used. To reduce this requirement
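To make the scale of the two cost sources concrete, the figures quoted above can be combined in a back-of-envelope estimate; the trajectory count and per-trajectory core count below are assumptions loosely taken from the abstract, not reported results.

```python
# Back-of-envelope cost estimate using figures quoted in the abstract.
# Trajectory count and core count are illustrative assumptions.

days_per_traj_baseline = 11.5   # 5000-step glycine trajectory on 8 cores
days_per_traj_reduced  = 2.0    # after reusing directional data from prior steps

n_traj_conventional = 500_000   # "hundreds of thousands to millions" (assumed midpoint)

def core_days(days_per_traj, n_traj, cores_per_traj=8):
    return days_per_traj * n_traj * cores_per_traj

print(f"baseline: {core_days(days_per_traj_baseline, n_traj_conventional):,.0f} core-days")
print(f"reduced : {core_days(days_per_traj_reduced,  n_traj_conventional):,.0f} core-days")
# Even with the roughly 5.7x per-trajectory speedup, the trajectory count itself
# must also be reduced for such a study to fit within realistic HPC allocations.
```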
S. Coleman, Sudhakar Pamidighantam, M. V. Moer, Yang Wang, L. Koesterke, D. Spearot. "Performance Improvement and Workflow Development of Virtual Diffraction Calculations." Pages 61:1-61:7. DOI: https://doi.org/10.1145/2616498.2616552. Published 2014-07-13.
Abstract: Electron and x-ray diffraction are well-established experimental methods used to explore the atomic-scale structure of materials. In this work, a computational algorithm is presented to produce electron and x-ray diffraction patterns directly from atomistic simulation data. This algorithm advances beyond previous virtual diffraction methods by utilizing an ultra-high-resolution mesh of reciprocal space, which eliminates the need for a priori knowledge of the material structure. This paper focuses on (1) algorithmic advances necessary to improve the performance, memory efficiency, and scalability of the virtual diffraction calculation, and (2) the integration of the diffraction algorithm into a workflow across heterogeneous computing hardware for the purpose of integrating simulations, virtual diffraction calculations, and visualization of electron and x-ray diffraction patterns.
Rick Mohr, Paul Peltz. "Benchmarking SSD-Based Lustre File System Configurations." Pages 32:1-32:2. DOI: https://doi.org/10.1145/2616498.2616544. Published 2014-07-13.
Abstract: Due to recent development efforts, ZFS on Linux is now a viable alternative to the traditional ldiskfs backend used for production Lustre file systems. Certain ZFS features, such as copy-on-write, make it even more appealing for systems utilizing SSD storage. To compare the relative benefits of ZFS and ldiskfs for SSD-based Lustre file systems, a systematic bottom-up benchmarking effort was undertaken utilizing Beacon, a Cray CS300-AC™ cluster located at the University of Tennessee's Application Acceleration Center of Excellence (AACE). The Beacon cluster contains I/O nodes configured with Intel SSD drives to be deployed as a Lustre file system. Benchmark tests were run at all layers (SSD block device, RAID, ldiskfs/ZFS, Lustre) to measure performance as well as scaling behavior. We discuss the benchmark methodology used for performance testing and present results from a subset of these benchmarks. Anomalous I/O behavior discovered during the course of the benchmarking is also discussed.
Saurabh Jain, D. Tward, David S. Lee, Anthony Kolasny, Timothy Brown, J. Ratnanather, M. Miller, L. Younes. "Computational Anatomy Gateway: Leveraging XSEDE Computational Resources for Shape Analysis." Pages 54:1-54:6. DOI: https://doi.org/10.1145/2616498.2616553. Published 2014-07-13.
Abstract: Computational Anatomy (CA) is a discipline focused on the quantitative analysis of the variability in biological shape. The Large Deformation Diffeomorphic Metric Mapping (LDDMM) is the key algorithm, which assigns computable descriptors of anatomical shapes and a metric distance between shapes. This is achieved by describing populations of anatomical shapes as a group of diffeomorphic transformations applied to a template, and using a metric on the space of diffeomorphisms. LDDMM is being used extensively in the neuroimaging (www.mristudio.org) and cardiovascular imaging (www.cvrgrid.org) communities. There are two major components involved in shape analysis using this paradigm. First is the estimation of the template, and second is calculating the diffeomorphisms mapping the template to each subject in the population. Template estimation is a computationally expensive problem involving an iterative process, where each iteration calculates one diffeomorphism for each target. These can be calculated in parallel and independently of each other, and XSEDE is providing the resources, in particular those of the Stampede cluster, that make these computations possible for large populations. Mappings from the estimated template to each subject can also be run in parallel. In addition, the NVIDIA Tesla GPUs available on Stampede present the possibility of speeding up certain convolution-like calculations which lend themselves well to the general-purpose GPU computation model. We are also exploring the use of the available Xeon Phi coprocessors to increase the efficiency of our codes. This will have a huge impact on both the neuroimaging and cardiac imaging communities as we bring these shape analysis tools online for use through our web service (www.mricloud.org), with the XSEDE Computational Anatomy Gateway providing the resources to handle the computational demands for large populations.
C. Jordan. "Evaluation of parallel and distributed file system technologies for XSEDE." Page 9:1. DOI: https://doi.org/10.1145/2335755.2335799. Published 2012-07-16.
Abstract: A long-running goal of XSEDE and other large-scale cyberinfrastructure efforts, including the NSF's earlier TeraGrid project, has been the deployment of wide-area file systems within large-scale grid contexts. These technologies ideally combine the accessibility of local resources with the scale and diversity of national-scale cyberinfrastructure, and several deployments of such file systems have been successful, including the GPFS-WAN file system deployed at SDSC and the Data Capacitor-WAN file system deployed at Indiana University. In the XSEDE project, a major data services area task is to deploy a single XSEDE-Wide File System (XWFS) available from all tier 1 service providers as well as major campus partners of XSEDE. In preparation for this deployment, an XWFS evaluation process has been undertaken to determine the most appropriate technology to meet the technical and other requirements of the XSEDE community. GPFS, Lustre, and SLASH2, all technologies that have been used in wide-area network contexts for previous projects, were selected for intensive evaluation, and NFS was also examined for use in combination with these underlying file system technologies to overcome platform compatibility issues.

This presentation will describe the process and outcomes of the XSEDE-Wide File System evaluation effort, including a detailed discussion of the requirements development and evaluation process, benchmark development, and test system deployment. We will also discuss additional factors that were determined to be relevant to the selection of a file system technology for widespread deployment in XSEDE, such as the robustness of documentation and the size and sophistication of the user community, as well as similar deployments in other large-scale cyberinfrastructure projects. Next steps for the XWFS effort will also be discussed in the context of the overall XSEDE systems engineering process.
M. Rynge, S. Callaghan, E. Deelman, G. Juve, Gaurang Mehta, K. Vahi, P. Maechling. "Enabling large-scale scientific workflows on petascale resources using MPI master/worker." Pages 49:1-49:8. DOI: https://doi.org/10.1145/2335755.2335846. Published 2012-07-16.
Abstract: Computational scientists often need to execute large, loosely coupled parallel applications such as workflows and bags of tasks in order to do their research. These applications are typically composed of many short-running serial tasks, which frequently demand large amounts of computation and storage. In order to produce results in a reasonable amount of time, scientists would like to execute these applications using petascale resources. In the past this has been a challenge because petascale systems are not designed to execute such workloads efficiently. In this paper we describe a new approach to executing large, fine-grained workflows on distributed petascale systems. Our solution involves partitioning the workflow into independent subgraphs and then submitting each subgraph as a self-contained MPI job to the available resources (often remote). We describe how the partitioning and job management have been implemented in the Pegasus Workflow Management System. We also explain how this approach provides an end-to-end solution for challenges related to system architecture, queue policies and priorities, and application reuse and development. Finally, we describe how the system is being used to enable the execution of a very large seismic hazard analysis application on XSEDE resources.
Katherine A. Lawrence, Nancy Wilkins-Diehr. "Roadmaps, not blueprints: paving the way to science gateway success." Pages 40:1-40:8. DOI: https://doi.org/10.1145/2335755.2335837. Published 2012-07-16.
Abstract: As science today grows ever more digital, it poses exciting challenges and opportunities for researchers. The existence of science gateways, and the advanced cyberinfrastructure (CI) tools and resources behind their accessible Web interfaces, can significantly improve the productivity of researchers facing the most difficult challenges, but designing the most effective tools requires an investment of time, effort, and money. Because not all gateways can be funded in the long term, it is important to identify the characteristics of successful gateways and make early efforts to incorporate whatever strategies will set up new gateways for success. Our research seeks to identify why some gateway projects change the way science is conducted in a given community while other gateways do not. Through a series of five full-day, iterative, multidisciplinary focus groups, we have gathered input and insights from sixty-six participants representing a diverse array of gateways and portals, funding organizations, research institutions, and industrial backgrounds. In this paper, we describe the key factors for success as well as the situational enablers of these factors. These findings are grouped into five main topical areas (the builders, the users, the roadmaps, the gateways, and the support systems), but we find that many of these factors and enablers are intertwined and inseparable, and there is no easy prescription for success.
Channing Brown, Iftekhar Ahmed, Y. D. Cai, M. S. Poole, Andrew Pilny, Yannick Atouba Ada. "Comparing the performance of group detection algorithm in serial and parallel processing environments." Pages 21:1-21:4. DOI: https://doi.org/10.1145/2335755.2335817. Published 2012-07-16.
Abstract: Developing an algorithm for group identification from a collection of individuals without grouping data has been receiving significant attention because of the need for increased understanding of groups and teams in online environments. This study used space, time, task, and players' virtual behavioral indicators from a game database to develop an algorithm to detect groups over time. The group detection algorithm was originally developed for a serial processing environment and later modified to allow for parallel processing on Gordon. For a collection of data representing 192 days of game play (approximately 140 gigabytes of log data), the computation required 266 minutes for the major steps of the analysis when running on a single processor. The same computation required 25 minutes when running on Gordon with 16 processors. The provision of massive compute nodes and the rich shared-memory environment on Gordon improved the performance of our analysis by a factor of 11. Besides demonstrating the possibility of saving time and effort, this study also highlights some lessons learned from adapting a serial detection algorithm to parallel environments.