Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.)
Latest Publications
M. Shephard, Cameron W. Smith. "HPC Simulation Workflows for Engineering Innovation." Pages 56:1-56:2. DOI: https://doi.org/10.1145/2616498.2616556. Published 2014-07-13.
Abstract: Efforts to develop component-based simulation workflows for industrial applications using XSEDE parallel computing systems are presented.
Troy Baer, Douglas Johnson. "pbsacct: A Workload Analysis System for PBS-Based HPC Systems." Pages 42:1-42:6. DOI: https://doi.org/10.1145/2616498.2616539. Published 2014-07-13.
Abstract: The PBS family of resource management systems has historically not included workload analysis tools, and currently available third-party workload analysis packages often have no way to identify the applications being run through the batch environment. This paper introduces the pbsacct system, which solves the application identification problem by storing job scripts alongside accounting information and allowing the development of site-specific heuristics to map job-script patterns to applications. The system consists of a database, data ingestion tools, and command-line and web-based user interfaces. The paper discusses the pbsacct system and deployments at two sites, the National Institute for Computational Sciences and the Ohio Supercomputer Center. Workload analyses for systems at each site are also discussed.
Y. Zhuang, M. Ceotto, W. Hase. "Towards Efficient Direct Semiclassical Molecular Dynamics for Complex Molecular Systems." Page 26:1. DOI: https://doi.org/10.1145/2616498.2616519. Published 2014-07-13.
Abstract: Chemical processes are intrinsically quantum mechanical, and quantum effects cannot be excluded a priori. Classical dynamics using fitted force fields has been routinely applied to complex molecular systems. But because the force fields used in classical dynamics are tuned to fit experimental and/or electronic structure data, the harmonic potential approximation and the neglect of quantum effects are compensated for artificially and in an ad hoc manner. Fitting atomic forces is also a trade-off between the desired accuracy and the human and computational effort required to construct them, and it is often biased by the functional forms chosen. The resulting force field may therefore not be transferable, i.e., it cannot be applied a priori to other molecular systems. In addition, force fields do not account for bond dissociation or excited vibrational processes, due to the harmonic approximation.

To bypass these force field limitations, an alternative is the direct dynamics (on-the-fly) approach, in which classical nuclear dynamics is coupled with atomic forces calculated from quantum mechanical electronic structure theory. Direct semiclassical molecular dynamics employs thousands of direct dynamics trajectories to calculate the Feynman path integral propagator and reproduces quantitative quantum effects with errors often smaller than 1%, making it a very promising tool for including quantum effects in complex molecular systems. Direct semiclassical dynamics incurs much lower computation cost than purely quantum dynamics, but its cost must still be reduced substantially before it can be applied to complex and interesting molecular systems on large HPC machines.

The high computation cost of direct semiclassical dynamics comes from two sources: the large number of trajectories needed, and the enormous cost of calculating a single trajectory. In this talk, we present our efforts to contain the computation costs from both sources in order to make direct semiclassical dynamics feasible on modern HPC systems.

A single trajectory of a direct semiclassical dynamics simulation may take days to weeks on a powerful multi-core processor. For instance, our ongoing study of 10-atom glycine with the B3LYP/6-31G** electronic structure theory takes about 11.5 days on two quad-core Intel Xeon 2.26 GHz processors (8 cores total) for a trajectory of 5000 time steps. To reduce the single-trajectory calculation time, we developed a mathematical method that reuses directional data buried in previously calculated quantum data for future time steps, thereby reducing the number of expensive quantum mechanical electronic structure calculations. With the new method, we are able to reduce the computation time of a 5000-step trajectory to about 2 days with almost the same accuracy.

A simulation study for glycine requires hundreds of thousands to even millions of trajectories when a usual semiclassical method is used. To reduce this requirement
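To make the scale of the two cost sources concrete, the figures quoted above can be combined in a back-of-envelope estimate; the trajectory count and per-trajectory core count below are assumptions loosely taken from the abstract, not reported results.

```python
# Back-of-envelope cost estimate using figures quoted in the abstract.
# Trajectory count and core count are illustrative assumptions.

days_per_traj_baseline = 11.5   # 5000-step glycine trajectory on 8 cores
days_per_traj_reduced  = 2.0    # after reusing directional data from prior steps

n_traj_conventional = 500_000   # "hundreds of thousands to millions" (assumed midpoint)

def core_days(days_per_traj, n_traj, cores_per_traj=8):
    return days_per_traj * n_traj * cores_per_traj

print(f"baseline: {core_days(days_per_traj_baseline, n_traj_conventional):,.0f} core-days")
print(f"reduced : {core_days(days_per_traj_reduced,  n_traj_conventional):,.0f} core-days")
# Even with the roughly 5.7x per-trajectory speedup, the trajectory count itself
# must also be reduced for such a study to fit within realistic HPC allocations.
```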
S. Coleman, Sudhakar Pamidighantam, M. V. Moer, Yang Wang, L. Koesterke, D. Spearot. "Performance Improvement and Workflow Development of Virtual Diffraction Calculations." Pages 61:1-61:7. DOI: https://doi.org/10.1145/2616498.2616552. Published 2014-07-13.
Abstract: Electron and x-ray diffraction are well-established experimental methods used to explore the atomic-scale structure of materials. In this work, a computational algorithm is presented to produce electron and x-ray diffraction patterns directly from atomistic simulation data. This algorithm advances beyond previous virtual diffraction methods by utilizing an ultra-high-resolution mesh of reciprocal space, which eliminates the need for a priori knowledge of the material structure. This paper focuses on (1) algorithmic advances necessary to improve the performance, memory efficiency, and scalability of the virtual diffraction calculation, and (2) the integration of the diffraction algorithm into a workflow across heterogeneous computing hardware for the purpose of integrating simulations, virtual diffraction calculations, and visualization of electron and x-ray diffraction patterns.
Rick Mohr, Paul Peltz. "Benchmarking SSD-Based Lustre File System Configurations." Pages 32:1-32:2. DOI: https://doi.org/10.1145/2616498.2616544. Published 2014-07-13.
Abstract: Due to recent development efforts, ZFS on Linux is now a viable alternative to the traditional ldiskfs backend used for production Lustre file systems. Certain ZFS features, such as copy-on-write, make it even more appealing for systems utilizing SSD storage. To compare the relative benefits of ZFS and ldiskfs for SSD-based Lustre file systems, a systematic bottom-up benchmarking effort was undertaken utilizing Beacon, a Cray CS300-AC™ cluster located at the University of Tennessee's Application Acceleration Center of Excellence (AACE). The Beacon cluster contains I/O nodes configured with Intel SSD drives to be deployed as a Lustre file system. Benchmark tests were run at all layers (SSD block device, RAID, ldiskfs/ZFS, Lustre) to measure performance as well as scaling behavior. We discuss the benchmark methodology used for performance testing and present results from a subset of these benchmarks. Anomalous I/O behavior discovered during the course of the benchmarking is also discussed.
Saurabh Jain, D. Tward, David S. Lee, Anthony Kolasny, Timothy Brown, J. Ratnanather, M. Miller, L. Younes. "Computational Anatomy Gateway: Leveraging XSEDE Computational Resources for Shape Analysis." Pages 54:1-54:6. DOI: https://doi.org/10.1145/2616498.2616553. Published 2014-07-13.
Abstract: Computational Anatomy (CA) is a discipline focused on the quantitative analysis of the variability in biological shape. The Large Deformation Diffeomorphic Metric Mapping (LDDMM) is the key algorithm, which assigns computable descriptors of anatomical shapes and a metric distance between shapes. This is achieved by describing populations of anatomical shapes as a group of diffeomorphic transformations applied to a template, and using a metric on the space of diffeomorphisms. LDDMM is being used extensively in the neuroimaging (www.mristudio.org) and cardiovascular imaging (www.cvrgrid.org) communities. There are two major components involved in shape analysis using this paradigm. First is the estimation of the template, and second is calculating the diffeomorphisms mapping the template to each subject in the population. Template estimation is a computationally expensive problem involving an iterative process, where each iteration calculates one diffeomorphism for each target. These can be calculated in parallel and independently of each other, and XSEDE is providing the resources, in particular those of the Stampede cluster, that make these computations possible for large populations. Mappings from the estimated template to each subject can also be run in parallel. In addition, the NVIDIA Tesla GPUs available on Stampede present the possibility of speeding up certain convolution-like calculations which lend themselves well to the general-purpose GPU computation model. We are also exploring the use of the available Xeon Phi coprocessors to increase the efficiency of our codes. This will have a huge impact on both the neuroimaging and cardiac imaging communities as we bring these shape analysis tools online for use through our web service (www.mricloud.org), with the XSEDE Computational Anatomy Gateway providing the resources to handle the computational demands for large populations.
C. Jordan. "Evaluation of parallel and distributed file system technologies for XSEDE." Page 9:1. DOI: https://doi.org/10.1145/2335755.2335799. Published 2012-07-16.
Abstract: A long-running goal of XSEDE and other large-scale cyberinfrastructure efforts, including the NSF's earlier TeraGrid project, has been the deployment of wide-area file systems within large-scale grid contexts. These technologies ideally combine the accessibility of local resources with the scale and diversity of national-scale cyberinfrastructure, and several deployments of such file systems have been successful, including the GPFS-WAN file system deployed at SDSC and the Data Capacitor-WAN file system deployed at Indiana University. In the XSEDE project, a major data services area task is to deploy a single XSEDE-Wide File System (XWFS) available from all tier 1 service providers as well as major campus partners of XSEDE. In preparation for this deployment, an XWFS evaluation process has been undertaken to determine the most appropriate technology to meet the technical and other requirements of the XSEDE community. GPFS, Lustre, and SLASH2, all technologies that have been used in wide-area network contexts for previous projects, were selected for intensive evaluation, and NFS was also examined for use in combination with these underlying file system technologies to overcome platform compatibility issues.

This presentation will describe the process and outcomes of the XSEDE-Wide File System evaluation effort, including a detailed discussion of the requirements development and evaluation process, benchmark development, and test system deployment. We will also discuss additional factors that were determined to be relevant to the selection of a file system technology for widespread deployment in XSEDE, such as the robustness of documentation and the size and sophistication of the user community, as well as similar deployments in other large-scale cyberinfrastructure projects. Next steps for the XWFS effort will also be discussed in the context of the overall XSEDE systems engineering process.
M. Rynge, S. Callaghan, E. Deelman, G. Juve, Gaurang Mehta, K. Vahi, P. Maechling. "Enabling large-scale scientific workflows on petascale resources using MPI master/worker." Pages 49:1-49:8. DOI: https://doi.org/10.1145/2335755.2335846. Published 2012-07-16.
Abstract: Computational scientists often need to execute large, loosely coupled parallel applications such as workflows and bags of tasks in order to do their research. These applications are typically composed of many short-running serial tasks, which frequently demand large amounts of computation and storage. In order to produce results in a reasonable amount of time, scientists would like to execute these applications using petascale resources. In the past this has been a challenge because petascale systems are not designed to execute such workloads efficiently. In this paper we describe a new approach to executing large, fine-grained workflows on distributed petascale systems. Our solution involves partitioning the workflow into independent subgraphs and then submitting each subgraph as a self-contained MPI job to the available resources (often remote). We describe how the partitioning and job management have been implemented in the Pegasus Workflow Management System. We also explain how this approach provides an end-to-end solution for challenges related to system architecture, queue policies and priorities, and application reuse and development. Finally, we describe how the system is being used to enable the execution of a very large seismic hazard analysis application on XSEDE resources.
Katherine A. Lawrence, Nancy Wilkins-Diehr. "Roadmaps, not blueprints: paving the way to science gateway success." Pages 40:1-40:8. DOI: https://doi.org/10.1145/2335755.2335837. Published 2012-07-16.
Abstract: As science today grows ever more digital, it poses exciting challenges and opportunities for researchers. The existence of science gateways, and the advanced cyberinfrastructure (CI) tools and resources behind their accessible Web interfaces, can significantly improve the productivity of researchers facing the most difficult challenges, but designing the most effective tools requires an investment of time, effort, and money. Because not all gateways can be funded in the long term, it is important to identify the characteristics of successful gateways and make early efforts to incorporate whatever strategies will set up new gateways for success. Our research seeks to identify why some gateway projects change the way science is conducted in a given community while other gateways do not. Through a series of five full-day, iterative, multidisciplinary focus groups, we have gathered input and insights from sixty-six participants representing a diverse array of gateways and portals, funding organizations, research institutions, and industrial backgrounds. In this paper, we describe the key factors for success as well as the situational enablers of these factors. These findings are grouped into five main topical areas (the builders, the users, the roadmaps, the gateways, and the support systems), but we find that many of these factors and enablers are intertwined and inseparable, and there is no easy prescription for success.
Channing Brown, Iftekhar Ahmed, Y. D. Cai, M. S. Poole, Andrew Pilny, Yannick Atouba Ada. "Comparing the performance of group detection algorithm in serial and parallel processing environments." Pages 21:1-21:4. DOI: https://doi.org/10.1145/2335755.2335817. Published 2012-07-16.
Abstract: Developing an algorithm for group identification from a collection of individuals without grouping data has been receiving significant attention because of the need for increased understanding of groups and teams in online environments. This study used space, time, task, and players' virtual behavioral indicators from a game database to develop an algorithm to detect groups over time. The group detection algorithm was originally developed for a serial processing environment and later modified to allow for parallel processing on Gordon. For a collection of data representing 192 days of game play (approximately 140 gigabytes of log data), the computation required 266 minutes for the major steps of the analysis when running on a single processor. The same computation required 25 minutes when running on Gordon with 16 processors. The provision of massive compute nodes and the rich shared-memory environment on Gordon improved the performance of our analysis by a factor of 11. Besides demonstrating the possibility of saving time and effort, this study also highlights some lessons learned from adapting a serial detection algorithm to parallel environments.