
2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing: Latest Publications

Big Data Provenance Analysis and Visualization
Peng Chen, Beth Plale
Provenance captured from E-Science experimentation is often large and complex, for instance, from agent-based simulations in which tens of thousands of heterogeneous components interact over extended time periods. The subject of my dissertation is the use of E-Science provenance at scale. My initial research studied the visualization of large provenance graphs and proposed an abstract representation of provenance that supports useful data mining. Recent work involves analyzing large provenance data generated from agent-based simulations on a single machine. In continuation, I propose stream processing techniques to support the continuous, real-time analysis of data provenance captured from agent-based simulations on HPC systems, which has unprecedented volume and complexity.
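The proposed streaming analysis can be illustrated with a minimal sketch. The record format and function below are invented for illustration, not the dissertation's implementation: a single pass over a stream of provenance records keeps only constant-size per-agent state instead of the full provenance graph.

```python
from collections import defaultdict

def stream_provenance(records):
    """One-pass aggregation over a stream of provenance records.

    Each record is (agent_id, activity, timestamp). Only constant-size
    running state is kept per agent, never the full provenance graph,
    so the analysis can run continuously while the simulation executes.
    """
    activity_counts = defaultdict(int)   # agent -> number of records seen
    last_seen = {}                       # agent -> latest timestamp
    for agent, activity, ts in records:
        activity_counts[agent] += 1
        last_seen[agent] = ts
    return dict(activity_counts), last_seen
```

Real provenance streams would carry richer edges (entity, activity, agent triples), but the constant-memory, single-pass shape is what makes real-time analysis at HPC scale feasible.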
Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.85
Citations: 15
Towards Context-Aware Mobile Crowdsensing in Vehicular Social Networks
Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.155
Xiping Hu, Victor C. M. Leung
Driving is an integral part of our everyday lives, and the average driving time of people globally is increasing to 84 minutes every day, a time when people are uniquely vulnerable. A number of research works have identified that mobile crowd sensing in vehicular social networks (VSNs) can be used effectively for many purposes and bring huge economic benefits, e.g., safety improvement and traffic management. This paper presents our efforts toward context-aware mobile crowd sensing in VSNs. First, we introduce a novel application-oriented service collaboration (ASCM) model which can automatically match multiple users with multiple mobile crowd sensing tasks in VSNs in an efficient manner. After that, for users' dynamic contexts in VSNs, we propose a context information management model that aims to enable mobile crowd sensing applications to autonomously match appropriate services and information with different users (requesters and participants) in crowdsensing.
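The many-users-to-many-tasks matching the abstract describes can be sketched in its simplest form. The ASCM model itself is more elaborate; `match_tasks` and the scoring function are invented for this illustration:

```python
def match_tasks(users, tasks, score):
    """Greedy many-to-many matching: each task goes to the still-free
    user with the highest context-match score for it.

    users: iterable of user ids; tasks: iterable of task ids;
    score(user, task) -> float, higher means a better context match.
    """
    free = set(users)
    assignment = {}   # task -> user
    for task in tasks:
        if not free:
            break  # more tasks than users; leftover tasks stay unassigned
        best = max(free, key=lambda u: score(u, task))
        assignment[task] = best
        free.remove(best)
    return assignment
```

In a context-aware system the score would be computed from dynamic context (location, driving state, device capability) rather than supplied as a static table.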
Citations: 12
Revisiting ILP Designs for Throughput-Oriented GPGPU Architecture
Ping Xiang, Yi Yang, Mike Mantor, Norman Rubin, Huiyang Zhou
Many-core architectures such as graphics processing units (GPUs) rely on thread-level parallelism (TLP) to overcome pipeline hazards. Consequently, each core in a many-core processor employs a relatively simple in-order pipeline with limited capability to exploit instruction-level parallelism (ILP). In this paper, we study the impact of ILP on throughput-oriented many-core architectures, including data bypassing, scoreboarding, and branch prediction. We show that these ILP techniques significantly reduce the performance dependency on TLP. This is especially useful for applications whose resource usage limits the number of threads the hardware can run concurrently. Furthermore, ILP techniques reduce the demand on on-chip resources needed to support high TLP. Given the workload-dependent impact of ILP, we propose a heterogeneous GPGPU architecture, consisting of both cores designed for high TLP and cores customized with ILP techniques. Our results show that our heterogeneous GPU architecture achieves high throughput as well as high energy and area efficiency compared to homogeneous designs.
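Why data bypassing reduces the TLP needed to hide hazards can be seen in a toy in-order pipeline model. The latencies and the simulator below are hypothetical, not the paper's methodology: with a bypass network a dependent instruction can consume a result one cycle after issue; without one it must wait until the result reaches writeback.

```python
def stall_cycles(instrs, bypass):
    """Count data-hazard stall cycles for one in-order thread.

    instrs: list of (dst_reg, [src_regs]). A result becomes available
    `lat` cycles after issue: 1 with forwarding from execute (bypass),
    3 if consumers must wait for writeback (no bypass).
    """
    lat = 1 if bypass else 3
    ready = {}   # register -> cycle its value becomes available
    cycle = 0
    stalls = 0
    for dst, srcs in instrs:
        need = max((ready.get(s, 0) for s in srcs), default=0)
        if need > cycle:            # operand not ready: stall until it is
            stalls += need - cycle
            cycle = need
        ready[dst] = cycle + lat
        cycle += 1                  # issue one instruction per cycle
    return stalls
```

For a three-instruction dependency chain, the bypassed pipeline issues back-to-back with no stalls, while the non-bypassed one stalls twice per dependence; a GPU hides those stalls by switching threads, which is exactly the TLP demand the paper's ILP techniques reduce.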
Pub Date : 2015-05-04 DOI: 10.1109/CCGRID.2015.14
Citations: 4
Lessons Learned Implementing User-Level Failure Mitigation in MPICH
Wesley Bland, Huiwei Lu, Sangmin Seo, P. Balaji
User-level failure mitigation (ULFM) is becoming the front-running solution for process fault tolerance in MPI. While not yet adopted into the MPI standard, it is being used by applications and libraries and is being considered by the MPI Forum for future inclusion into MPI itself. In this paper, we introduce an implementation of ULFM in MPICH, a high-performance and widely portable implementation of the MPI standard. We demonstrate that, while ours is still a reference implementation, the runtime cost of the newly introduced API calls is relatively low.
Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.51
Citations: 14
Full Integrity and Freshness for Outsourced Storage
Hao Jin, Hong Jiang, Ke Zhou, Ronglei Wei, Dongliang Lei, Ping Huang
Data outsourcing relieves cloud users of the heavy burden of infrastructure management and maintenance. However, the handover of data control to untrusted cloud servers significantly complicates the security issues. Conventional signature verification, widely adopted in cryptographic storage systems, only guarantees the integrity of retrieved data; for data that is rarely or never accessed, it does not work. This paper integrates proof-of-storage techniques with data dynamics support into a cryptographic storage design to provide full integrity for outsourced data. Besides, we provide an instantaneous freshness check for retrieved data to defend against potential replay attacks. We achieve these goals by designing flexible block structures and combining broadcast encryption, key regression, Merkle hash trees, proof of storage, and fine-grained access control policies to provide a secure storage service for outsourced data. Experimental evaluation of our prototype shows that the cryptographic cost and throughput are reasonable and acceptable.
Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.90
Citations: 0
Energy Profiling Using IgProf
Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.118
Kashif Nizam Khan, Filip Nyback, Zhonghong Ou, J. Nurminen, T. Niemi, G. Eulisse, P. Elmer, David Abdurachmanov
Energy efficiency has become a primary concern for data centers in recent years. Understanding where energy is spent within a piece of software is fundamental for energy-efficiency studies as a whole. In this paper, we take the first step in this direction by building an energy profiling module on top of IgProf, an application profiler developed at CERN for scientific computing workloads. The energy profiling module is based on sampling and obtains energy measurements from the Running Average Power Limit (RAPL) interface present on the latest Intel processors. The initial profiling results for a single-threaded program demonstrate the module's potential, showing a close correlation between the execution time and the energy spent within a function.
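RAPL exposes a wrapping energy counter (on Linux, for example, via the powercap sysfs files `intel-rapl:0/energy_uj` and `max_energy_range_uj`), so a sampling profiler must handle counter wraparound when differencing two samples. A sketch of that arithmetic, independent of IgProf's actual code:

```python
def rapl_energy_delta_uj(before, after, max_range_uj):
    """Microjoules consumed between two RAPL energy-counter samples.

    The counter increases monotonically but wraps at max_range_uj, so a
    smaller second reading means exactly one wraparound occurred (valid
    as long as sampling is faster than the wrap period).
    """
    if after >= before:
        return after - before
    return max_range_uj - before + after
```

A profiler then attributes each sampled delta to the function on top of the call stack at sampling time, which is how per-function energy figures like those in the paper are accumulated.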
Citations: 11
Modeling Gather and Scatter with Hardware Performance Counters for Xeon Phi
James Lin, Akira Nukada, S. Matsuoka
Intel Initial Many-Core Instructions (IMCI) for Xeon Phi introduce hardware-implemented Gather and Scatter (G/S) instructions that load/store the contents of SIMD registers from/to non-contiguous memory locations. However, they can be one of the key performance bottlenecks on Xeon Phi. Modeling G/S can provide insight into performance on Xeon Phi; however, the existing solution requires a hand-written assembly implementation. Therefore, we modeled G/S with hardware performance counters, which can be profiled by tools such as PAPI. We profiled Address Generation Interlock (AGI) events as the number of G/S operations, estimated the average latency of G/S with VPU_DATA_READ, and combined them to model the total latency of G/S. We applied our model to a 3D 7-point stencil, and the results showed that G/S accounted for nearly 40% of total kernel time. We also validated the model by implementing a G/S-free version with intrinsics. The contribution of this work is a performance model for G/S built with hardware counters. We believe the model can be generally applicable to CPUs as well.
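The model as described combines two profiled quantities: an event count and an estimated average latency. A deliberately simplified rendering of that combination, with hypothetical function and parameter names (the counter names follow the abstract):

```python
def gs_total_latency(agi_events, vpu_read_cycles, vpu_read_count):
    """Estimated total gather/scatter (G/S) latency in cycles.

    agi_events approximates the number of G/S operations (profiled as
    Address Generation Interlock events); the average per-operation
    latency is estimated from the VPU_DATA_READ counter.
    """
    avg_latency = vpu_read_cycles / vpu_read_count
    return agi_events * avg_latency
```

The value of the counter-based approach is that both inputs come from PAPI-style profiling of the unmodified binary, with no hand-written assembly required.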
Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.59
Citations: 1
SparkSW: Scalable Distributed Computing System for Large-Scale Biological Sequence Alignment
Guoguang Zhao, Cheng Ling, Donghong Sun
The Smith-Waterman (SW) algorithm is universally used for database search owing to its high sensitivity. The widespread impact of the algorithm is reflected in the more than 8000 citations it has received over the past decades. However, the algorithm is prohibitively expensive in terms of time and space complexity, and so poses significant computational challenges. Apache Spark is an increasingly popular fast big-data analytics engine, which has been highly successful in implementing large-scale data-intensive applications on commodity hardware. This paper presents the first reported system, named SparkSW, that implements the SW algorithm on the Apache Spark distributed computing framework with a couple of off-the-shelf workstations. The scalability and load-balancing efficiency of the system are investigated with a realistic, ultra-large database from the state-of-the-art UniRef100. The experimental results indicate that 1) SparkSW balances load across parallel workloads and scales extremely well as computing resources increase, and 2) SparkSW provides a fast and universal option for highly sensitive biological sequence alignments. The success of SparkSW also reveals that the Apache Spark framework provides an efficient solution for coping with the ever-increasing sizes of biological sequence databases, especially those generated by second-generation sequencing technologies.
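The sequential kernel SparkSW distributes is the classic SW dynamic program. A minimal score-only version, with illustrative scoring parameters (a real pipeline would use a substitution matrix and affine gaps):

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local-alignment score via the classic
    O(len(a) * len(b)) dynamic program (score only, no traceback).

    Cells are clamped at 0, which is what makes the alignment local:
    a bad prefix never drags down a good local match.
    """
    h = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            h[i][j] = max(0, diag, h[i - 1][j] + gap, h[i][j - 1] + gap)
            best = max(best, h[i][j])
    return best
```

The quadratic time and space of this table is exactly the cost the abstract calls prohibitive; a Spark-based system parallelizes by scoring one query against many database sequences independently across workers.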
Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.55
Citations: 36
Modeling Cross-Architecture Co-Tenancy Performance Interference
Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.152
Wei Kuang, Laura E. Brown, Zhenlin Wang
Cloud computing has become a dominant computing paradigm to provide elastic, affordable computing resources to end users. Due to the increased computing power of modern machines driven by multi-/many-core computing, data centers often co-locate multiple virtual machines (VMs) on one physical machine, resulting in co-tenancy, resource sharing, and competition. Applications or VMs co-located on one physical machine can interfere with each other despite the promise of performance isolation through virtualization. Modeling and predicting co-run interference therefore becomes critical for data center job scheduling and QoS (Quality of Service) assurance. Co-run interference can be characterized by two metrics, sensitivity and pressure, where the former denotes how an application's performance is affected by its co-run applications, and the latter measures how it impacts the performance of its co-run applications. This paper shows that sensitivity and pressure are both application- and architecture-dependent. Further, we propose a regression model that predicts an application's sensitivity and pressure across architectures with high accuracy. This regression model enables a data center scheduler to guarantee the QoS of a VM/application when it is scheduled to co-locate with other VMs/applications.
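The paper's regression model presumably combines many architectural features; the simplest instance of the idea, ordinary least squares on a single feature (say, predicting sensitivity from cache pressure), looks like this. A pure-Python sketch, not the authors' model:

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b.

    xs: feature values (e.g., a co-runner's measured cache pressure);
    ys: observed targets (e.g., this application's slowdown).
    Returns the fitted slope a and intercept b.
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b
```

A scheduler would fit such a model offline per architecture, then evaluate it at placement time to reject co-locations whose predicted interference violates the QoS target.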
Citations: 7
The Challenge of Scaling Genome Big Data Analysis Software on TH-2 Supercomputer
Shaoliang Peng, Xiangke Liao, Canqun Yang, Yutong Lu, Jie Liu, Yingbo Cui, Heng Wang, Chengkun Wu, Bingqiang Wang
Whole-genome re-sequencing plays a crucial role in biomedical studies. The emergence of genomic big data calls for an enormous amount of computing power. However, current computational methods are inefficient in utilizing available computational resources. In this paper, we address this challenge by optimizing the utilization of the fastest supercomputer in the world, the TH-2 supercomputer. TH-2 features a neo-heterogeneous architecture, in which each compute node is equipped with 2 Intel Xeon CPUs and 3 Intel Xeon Phi coprocessors. This heterogeneity, together with the massive amount of data to be processed, poses great challenges for deploying the genome analysis software pipeline on TH-2. Runtime profiling shows that SOAP3-dp and SOAPsnp are the most time-consuming components (up to 70% of total runtime) in a typical genome-analysis pipeline. To optimize the whole pipeline, we first devise a number of parallelization and optimization strategies for SOAP3-dp and SOAPsnp, targeting each node to fully utilize the hardware resources provided by both the CPUs and the MICs. We also employ several scaling methods to reduce communication between nodes. We then scaled our method up on TH-2. With 8192 nodes, the whole analysis took 8.37 hours to process a 300 TB dataset of whole-genome sequences from 2,000 human beings, which can take as long as 8 months on a commodity server. The speedup is about 700x.
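The ~700x figure can be sanity-checked with back-of-the-envelope arithmetic (the average month length of 30.4 days is our assumption, not the paper's):

```python
# 8 months on one commodity server vs 8.37 hours on 8192 TH-2 nodes.
serial_hours = 8 * 30.4 * 24            # month length taken as 30.4 days
parallel_hours = 8.37
speedup = serial_hours / parallel_hours  # comes out near 700
```

Note the speedup is measured against a single server, not against one TH-2 node, so it bundles both per-node optimization and 8192-way scaling.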
{"title":"The Challenge of Scaling Genome Big Data Analysis Software on TH-2 Supercomputer","authors":"Shaoliang Peng, Xiangke Liao, Canqun Yang, Yutong Lu, Jie Liu, Yingbo Cui, Heng Wang, Chengkun Wu, Bingqiang Wang","doi":"10.1109/CCGrid.2015.46","DOIUrl":"https://doi.org/10.1109/CCGrid.2015.46","url":null,"abstract":"Whole genome re-sequencing plays a crucial role in biomedical studies. The emergence of genomic big data calls for an enormous amount of computing power. However, current computational methods are inefficient in utilizing available computational resources. In this paper, we address this challenge by optimizing the utilization of the fastest supercomputer in the world - TH-2 supercomputer. TH-2 is featured by its neo-heterogeneous architecture, in which each compute node is equipped with 2 Intel Xeon CPUs and 3 Intel Xeon Phi coprocessors. The heterogeneity and the massive amount of data to be processed pose great challenges for the deployment of the genome analysis software pipeline on TH-2. Runtime profiling shows that SOAP3-dp and SOAPsnp are the most time-consuming components (up to 70% of total runtime) in a typical genome-analyzing pipeline. To optimize the whole pipeline, we first devise a number of parallel and optimization strategies for SOAP3-dp and SOAPsnp, respectively targeting each node to fully utilize all sorts of hardware resources provided both by CPU and MIC. We also employ a few scaling methods to reduce communication between different nodes. We then scaled up our method on TH-2. With 8192 nodes, the whole analyzing procedure took 8.37 hours to finish the analysis of a 300 TB dataset of whole genome sequences from 2,000 human beings, which can take as long as 8 months on a commodity server. The speedup is about 700x.","PeriodicalId":6664,"journal":{"name":"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"8 1","pages":"823-828"},"PeriodicalIF":0.0,"publicationDate":"2015-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78334165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Journal
2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing