
Latest publications from the 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing

Optimize Parallel Data Access in Big Data Processing
Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.168
Jiangling Yin, Jun Wang
In recent years, the Hadoop Distributed File System (HDFS) has been deployed as the bedrock for many parallel big data processing systems, such as graph processing systems, MPI-based parallel programs, and Scala/Java-based Spark frameworks, which can efficiently support iterative and interactive data analysis in memory. The first part of my dissertation focuses on studying parallel data access in distributed file systems, e.g., HDFS. Since distributed I/O resources and global data distribution are often not taken into consideration, data requests from parallel processes/executors are unfortunately served in a remote and imbalanced fashion on the storage servers. To address these problems, we develop I/O middleware systems and matching-based algorithms that map parallel data requests to storage servers such that local and balanced data access can be achieved. The last part of my dissertation presents our plans to improve the performance of interactive data access in big data analysis. Specifically, most interactive analysis programs scan through the entire data set regardless of which data is actually required. We plan to develop a content-aware method to quickly access the required data without this laborious scanning process.
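To illustrate the locality- and balance-aware request mapping described above, here is a minimal greedy sketch in Python. The request/server structures, the `assign_requests` helper, and the least-loaded tie-breaking rule are hypothetical simplifications; the dissertation itself uses matching-based algorithms, which this sketch does not reproduce.

```python
# Hypothetical sketch: greedily assign parallel read requests to storage
# servers, preferring servers that hold a local replica and balancing load.
# Illustrates the locality/balance goal only, not the paper's matching algorithm.

def assign_requests(requests, servers):
    """requests: list of (request_id, set_of_servers_with_local_replica)
    servers: list of server ids.  Returns {request_id: server_id}."""
    load = {s: 0 for s in servers}
    assignment = {}
    for req_id, local_servers in requests:
        # Prefer the least-loaded server that already stores the data block;
        # fall back to any server if no local replica is available.
        candidates = [s for s in local_servers if s in load] or servers
        chosen = min(candidates, key=lambda s: load[s])
        assignment[req_id] = chosen
        load[chosen] += 1
    return assignment

if __name__ == "__main__":
    reqs = [("r1", {"s1", "s2"}), ("r2", {"s1"}), ("r3", {"s2", "s3"}), ("r4", {"s1"})]
    print(assign_requests(reqs, ["s1", "s2", "s3"]))
```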
Pages: 721-724
Citations: 3
Experience Based Sink Placement in Mobile Wireless Sensor Network
Subhra Banerjee, S. Bhunia, N. Mukherjee
In some applications of wireless sensor networks (WSNs), sensor nodes are mobile while the sinks are static. In such a dynamic environment, situations may arise where many sensor nodes forward data through the same sink node, resulting in sink overloading. One obvious effect of sink overloading is packet loss; it also indirectly affects the network lifetime in loss-sensitive WSN applications. Therefore, proper placement of sinks in such a dynamic environment has a great impact on the performance of WSN applications. Placing multiple sinks may also not work in some situations, as node density may not be uniform. This paper introduces a sink placement scheme that gathers experience about sensor node density in each region at different times and, based on these observations, proposes candidate sink locations in order to reduce sink overloading. Next, based on the current sensor node density pattern, sinks at some of these locations are scheduled to active mode, while sinks at the remaining candidate locations are scheduled to sleep mode. The second phase is repeated periodically. The scheme is implemented in a simulation environment and compared with another well-known strategy, namely Geographic Sink Placement (GSP). The proposed scheme exhibits better performance with respect to sink overloading and packet loss in comparison with GSP.
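The two phases of the scheme can be sketched as follows; the region identifiers, the `candidate_locations`/`schedule_sinks` helpers, the density threshold, and the averaging rule are illustrative assumptions rather than the paper's actual formulation.

```python
# Hypothetical sketch of the two phases described above: (1) use historical
# node-density observations per region to pick candidate sink locations,
# (2) periodically activate the sinks whose regions are currently dense and
# put the rest to sleep.

from collections import defaultdict

def candidate_locations(history, num_candidates):
    """history: list of {region: node_count} snapshots taken over time."""
    avg = defaultdict(float)
    for snapshot in history:
        for region, count in snapshot.items():
            avg[region] += count / len(history)
    # Regions with the highest average density become candidate sink sites.
    return sorted(avg, key=avg.get, reverse=True)[:num_candidates]

def schedule_sinks(candidates, current_density, active_threshold):
    """Return {region: 'active' | 'sleep'} for the candidate sinks."""
    return {r: ("active" if current_density.get(r, 0) >= active_threshold else "sleep")
            for r in candidates}

history = [{"A": 12, "B": 3, "C": 7}, {"A": 9, "B": 5, "C": 8}]
cands = candidate_locations(history, num_candidates=2)
print(schedule_sinks(cands, current_density={"A": 2, "C": 10}, active_threshold=5))
```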
Pages: 898-907
Citations: 2
Network-Constrained Packing of Brokered Workloads in Virtualized Environments
Pub Date : 2015-05-04 DOI: 10.1109/CCGRID.2015.110
Christine Bassem, Azer Bestavros
Providing resource allocation with performance-predictability guarantees is increasingly important in cloud platforms, especially for data-intensive applications, whose performance depends greatly on the available rates of data transfer between the various computing/storage hosts underlying the virtualized resources assigned to the application. Existing resource allocation solutions either assume that applications manage the data transfer between their virtualized resources, or that cloud providers manage their internal networking resources. With the increased prevalence of brokerage services in cloud platforms, there is a need for resource allocation solutions that provide predictability guarantees in settings where neither application scheduling nor cloud provider resources can be managed/controlled by the broker. This paper addresses this problem: we define the Network-Constrained Packing (NCP) problem of finding the optimal mapping of brokered resources to applications with guaranteed performance predictability. We prove that NCP is NP-hard, and we define two special instances of the problem for which exact solutions can be found efficiently. We develop a greedy heuristic to solve the general instance of the problem, and we evaluate its efficiency using simulations on various application workloads and network models.
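A rough idea of what packing under network constraints means can be conveyed with a simple best-fit greedy sketch; the `greedy_pack` helper, the (CPU, bandwidth) demand model, and the ordering rule are assumptions for illustration and are not the NCP heuristic evaluated in the paper.

```python
# Hypothetical greedy sketch of network-constrained packing: place each
# application's resource bundle on a host that has enough residual compute
# capacity *and* enough residual network bandwidth.

def greedy_pack(apps, hosts):
    """apps: list of (app_id, cpu_demand, bw_demand)
    hosts: dict host_id -> {"cpu": free_cpu, "bw": free_bandwidth}
    Returns {app_id: host_id or None}."""
    placement = {}
    # Pack the most demanding applications first (a common greedy ordering).
    for app_id, cpu, bw in sorted(apps, key=lambda a: (a[1], a[2]), reverse=True):
        feasible = [h for h, cap in hosts.items() if cap["cpu"] >= cpu and cap["bw"] >= bw]
        if not feasible:
            placement[app_id] = None          # cannot satisfy the guarantee
            continue
        # Choose the feasible host with the least leftover capacity (best fit).
        host = min(feasible, key=lambda h: hosts[h]["cpu"] - cpu + hosts[h]["bw"] - bw)
        hosts[host]["cpu"] -= cpu
        hosts[host]["bw"] -= bw
        placement[app_id] = host
    return placement

apps = [("a1", 4, 100), ("a2", 2, 300), ("a3", 1, 50)]
hosts = {"h1": {"cpu": 4, "bw": 200}, "h2": {"cpu": 4, "bw": 400}}
print(greedy_pack(apps, hosts))
```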
Pages: 149-158
Citations: 6
A Scheduler-Level Incentive Mechanism for Energy Efficiency in HPC
Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.101
Yiannis Georgiou, David Glesser, K. Rządca, D. Trystram
Energy consumption has become one of the most important factors in High Performance Computing platforms. However, while there are various algorithmic and programming techniques to save energy, a user currently has no incentive to employ them, as they might result in worse performance. We propose to manage the energy budget of a supercomputer through EnergyFairShare (EFS), a FairShare-like scheduling algorithm. FairShare is a classic scheduling rule that prioritizes jobs belonging to users who were assigned a small amount of CPU-seconds in the past. Similarly, EFS keeps track of users' consumption of watt-seconds and prioritizes those whose jobs consumed less energy. Therefore, EFS incentivizes users to optimize their code for energy efficiency. Having higher priority, jobs have smaller queuing times and, thus, smaller turn-around times. To validate this principle, we implemented EFS in a scheduling simulator and processed workloads from various HPC centers. The results show that, by reducing their energy consumption, users reduce their stretch (slowdown), compared with increasing their energy consumption. To validate the general feasibility of our approach, we also implemented EFS as an extension for SLURM, a popular HPC resource and job management system. We validated our plugin both by emulating a large-scale platform and by experiments on a real cluster with monitored energy consumption. We observed smaller waiting times for energy-efficient users.
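A minimal sketch of an EnergyFairShare-style accounting loop follows, assuming a simple exponential half-life decay and a reciprocal priority formula; the `EnergyFairShare` class, its decay constant, and the priority function are illustrative, not the EFS/SLURM implementation.

```python
# Hypothetical sketch: track each user's decayed watt-second consumption and
# give higher scheduling priority to users who have consumed less energy.

import math

class EnergyFairShare:
    def __init__(self, half_life_seconds=7 * 24 * 3600):
        self.decay = math.log(2) / half_life_seconds
        self.usage = {}            # user -> decayed watt-seconds
        self.last_update = {}      # user -> last update timestamp

    def record(self, user, watt_seconds, now):
        prev = self.usage.get(user, 0.0)
        dt = now - self.last_update.get(user, now)
        # Old consumption decays over time, as in classic FairShare accounting.
        self.usage[user] = prev * math.exp(-self.decay * dt) + watt_seconds
        self.last_update[user] = now

    def priority(self, user):
        # Lower accumulated energy usage -> higher scheduling priority.
        return 1.0 / (1.0 + self.usage.get(user, 0.0))

efs = EnergyFairShare()
efs.record("alice", watt_seconds=5.0e6, now=0)
efs.record("bob", watt_seconds=1.0e6, now=0)
print(sorted(["alice", "bob"], key=efs.priority, reverse=True))  # bob first
```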
Pages: 617-626
Citations: 27
Cross-Layer SLA Management for Cloud-hosted Big Data Analytics Applications
Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.175
Xuezhi Zeng, R. Ranjan, P. Strazdins, S. Garg, Lizhe Wang
As we come to terms with various big data challenges, one vital issue remains largely untouched: service level agreement (SLA) management to deliver strong Quality of Service (QoS) guarantees for big data analytics applications (BDAAs) sharing the same underlying infrastructure, for example, a public cloud platform. Although SLA and QoS are not new concepts, as they originated well before the cloud computing and big data era, their importance is amplified and their complexity aggravated by the emergence of time-sensitive BDAAs such as social-network-based stock recommendation and environmental monitoring. These applications require strong QoS guarantees and dependability from the underlying cloud computing platform to accommodate real-time responses while handling ever-increasing complexities and uncertainties. Hence, the overarching goal of this PhD research is to develop novel simulation, modelling and benchmarking tools and techniques that can aid researchers and practitioners in studying the impact of uncertainties (contention, failures, anomalies, etc.) on the final SLA and QoS of a cloud-hosted BDAA.
Pages: 765-768
Citations: 8
Towards Provenance-Based Anomaly Detection in MapReduce
C. Liao, A. Squicciarini
MapReduce enables parallel and distributed processing of vast amounts of data on a cluster of machines. However, this computing paradigm is subject to threats posed by malicious and cheating nodes or compromised user-submitted code that could tamper with data and computation, since users retain little control as the computation is carried out in a distributed fashion. In this paper, we focus on the analysis and detection of anomalies during the process of MapReduce computation. Accordingly, we develop a computational provenance system that captures provenance data related to MapReduce computation within the MapReduce framework in Hadoop. In particular, we identify a set of invariants over aggregated provenance information, which are later analyzed to uncover anomalies indicating possible tampering with data and computation. We conduct a series of experiments to show the efficiency and effectiveness of our proposed provenance system.
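As a concrete, if simplified, example of invariant checking over aggregated provenance, the sketch below compares per-key record counts emitted by mappers with those consumed by reducers; the invariant, the `check_count_invariant` helper, and the provenance record format are assumptions for illustration, not the invariants defined in the paper.

```python
# Hypothetical sketch of invariant checking over aggregated provenance.
# Example invariant: for every key, the record count emitted by mappers must
# equal the record count consumed by reducers; a mismatch flags possible tampering.

from collections import Counter

def check_count_invariant(map_provenance, reduce_provenance):
    """Each provenance list holds (key, record_count) tuples reported per task.
    Returns the set of keys whose aggregated counts disagree."""
    emitted = Counter()
    consumed = Counter()
    for key, count in map_provenance:
        emitted[key] += count
    for key, count in reduce_provenance:
        consumed[key] += count
    return {k for k in emitted.keys() | consumed.keys() if emitted[k] != consumed[k]}

map_prov = [("cat", 3), ("dog", 2), ("cat", 1)]
reduce_prov = [("cat", 4), ("dog", 1)]          # "dog" lost a record somewhere
print(check_count_invariant(map_prov, reduce_prov))  # {'dog'}
```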
Pages: 647-656
Citations: 24
mAMBER: Accelerating Explicit Solvent Molecular Dynamic with Intel Xeon Phi Many-Integrated Core Coprocessors
Xin Liu, Shaoliang Peng, Canqun Yang, Chengkun Wu, Haiqiang Wang, Qian Cheng, Weiliang Zhu, Jinan Wang
Molecular dynamics (MD) is a computer simulation of the physical movements of atoms and molecules, and is a very important research technique for the study of biological and chemical systems at the micro-scale. Assisted Model Building with Energy Refinement (AMBER) is one of the most commonly used software packages for MD. However, microsecond MD simulation of large-scale atom systems requires a great deal of computational power. In this paper, we propose mAMBER: an Intel Xeon Phi Many-Integrated Core (MIC) coprocessor-accelerated implementation of explicit-solvent all-atom classical molecular dynamics within the AMBER program package. mAMBER also includes a new parallel algorithm that uses CPUs and MIC coprocessors on the Tianhe-2 supercomputer. With several optimization techniques, including CPU/MIC collaborative parallelization, factorization, and an asynchronous data transfer framework, we accelerate the sander program of AMBER (version 12) in 'offload' mode and achieve a 4.17-fold overall speedup compared with the CPU-only sander program.
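One way to picture the CPU/MIC collaboration is a throughput-proportional split of the atom range between host and coprocessor, sketched below; the `split_atoms` helper and the example rates are purely illustrative assumptions and do not reflect mAMBER's actual load-balancing or asynchronous transfer scheme.

```python
# Hypothetical sketch: divide the atom range between host CPUs and a
# coprocessor in proportion to their measured throughputs, so both finish a
# force-computation step at roughly the same time.

def split_atoms(num_atoms, cpu_rate, mic_rate):
    """Return (cpu_range, mic_range) as half-open index ranges."""
    cpu_share = cpu_rate / (cpu_rate + mic_rate)
    boundary = int(num_atoms * cpu_share)
    return (0, boundary), (boundary, num_atoms)

# E.g. if the coprocessor computes forces ~3x faster than the host CPUs:
print(split_atoms(num_atoms=100000, cpu_rate=1.0, mic_rate=3.0))
# -> ((0, 25000), (25000, 100000))
```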
Pages: 729-732
Citations: 3
Quantitative Musings on the Feasibility of Smartphone Clouds
Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.115
Cheng Chen, M. Ehsan, R. Sion
"Green" and its "low power" cousin are the new hot spots in computing. In cloud data centers, at scale, ideas of deploying low-power ARM architectures or even large numbers of extremely "wimpy" nodes [1, 2] seem increasingly appealing. Skeptics, on the other hand, maintain that we cannot get more than what we pay for and that no free lunches can be had. In this paper we explore these theses and provide insights into the power-performance trade-off at scale for "wimpy", back-to-basics, power-efficient RISC architectures. We use ARM as a modern proxy for these and quantify the cost/performance ratio precisely enough to allow for a broader conclusion. We then offer an intuition as to why this may still hold in 2030.
Pages: 535-544
Citations: 1
Locality-Aware Stencil Computations Using Flash SSDs as Main Memory Extension
Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.126
H. Midorikawa, Hideyuki Tan
This paper investigates the performance of flash solid state drives (SSDs) as an extension to main memory, using a locality-aware algorithm for stencil computations. We propose three different configurations for accessing the flash media, swap, mmap, and aio, combined with data-structure blocking techniques. Our results indicate that hierarchical blocking optimizations across three tiers, flash SSD, DRAM, and cache, perform satisfactorily in bridging the DRAM-flash latency divide. Using only 32 GiB of DRAM and a flash SSD, with 7-point stencil computations for a 512 GiB problem (16 times the size of the DRAM), 87% of the Mflops execution performance achieved with DRAM alone was attained.
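The spatial blocking that such hierarchical (cache/DRAM/flash) tiling builds on can be sketched for a single 7-point Jacobi sweep as follows; the `stencil_7pt_blocked` function, the block size, and the grid size are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of a blocked 7-point stencil sweep: the grid is processed
# block by block so that each working set fits in a faster memory tier.

import numpy as np

def stencil_7pt_blocked(a, block=32):
    """One Jacobi-style sweep of the 7-point stencil over interior points,
    processed block by block to improve locality."""
    out = a.copy()
    nz, ny, nx = a.shape
    for k0 in range(1, nz - 1, block):
        for j0 in range(1, ny - 1, block):
            for i0 in range(1, nx - 1, block):
                k1 = min(k0 + block, nz - 1)
                j1 = min(j0 + block, ny - 1)
                i1 = min(i0 + block, nx - 1)
                # Average each interior point with its six neighbors.
                out[k0:k1, j0:j1, i0:i1] = (
                    a[k0-1:k1-1, j0:j1, i0:i1] + a[k0+1:k1+1, j0:j1, i0:i1] +
                    a[k0:k1, j0-1:j1-1, i0:i1] + a[k0:k1, j0+1:j1+1, i0:i1] +
                    a[k0:k1, j0:j1, i0-1:i1-1] + a[k0:k1, j0:j1, i0+1:i1+1] +
                    a[k0:k1, j0:j1, i0:i1]) / 7.0
    return out

grid = np.random.rand(64, 64, 64)
updated = stencil_7pt_blocked(grid)
```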
Pages: 1163-1168
Citations: 8
A Structured Light 3D Measurement System Based on Heterogeneous Parallel Computation Model
Xiaoyu Liu, Hao Sheng, Yang Zhang, Z. Xiong
We present a structured light measurement system that collects high-accuracy surface information of the measured object with good real-time performance. Utilizing the phase-shifting method in conjunction with a matching method proposed in this paper, which significantly reduces noisy points, we can obtain a high-accuracy, noise-free point cloud in a complex industrial environment. Thanks to the heterogeneous parallel computation model, the parallelism of the algorithm is exploited in depth. An OpenMP+CUDA hybrid computing model is then used in the system to achieve better real-time performance.
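For reference, the standard four-step phase-shifting computation that such systems typically build on looks like the sketch below; the `wrapped_phase` helper and the synthetic fringe data are assumptions for illustration and omit the paper's matching and denoising steps.

```python
# Sketch of four-step phase shifting: recover the wrapped phase per pixel from
# four fringe images captured with phase shifts of 0, pi/2, pi and 3*pi/2.

import numpy as np

def wrapped_phase(i1, i2, i3, i4):
    """i1..i4: fringe images (2-D arrays) with shifts 0, pi/2, pi, 3*pi/2.
    Returns the wrapped phase in (-pi, pi] for every pixel."""
    return np.arctan2(i4 - i2, i1 - i3)

# Synthetic example: generate four shifted fringe patterns and recover the phase.
h, w = 4, 8
true_phase = np.tile(np.linspace(-3.0, 3.0, w), (h, 1))
shifts = [0, np.pi / 2, np.pi, 3 * np.pi / 2]
imgs = [0.5 + 0.4 * np.cos(true_phase + d) for d in shifts]
print(np.allclose(wrapped_phase(*imgs), true_phase, atol=1e-9))  # True
```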
Pages: 1027-1036
Citations: 2