
2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid: Latest Papers

PAIS: A Proximity-Aware Interest-Clustered P2P File Sharing System
Pub Date : 2009-05-18 DOI: 10.1109/CCGRID.2009.17
Haiying Shen
Efficient file query is important to the overall performance of Peer-to-Peer (P2P) file sharing systems. Clustering peers by their common interests can significantly enhance the efficiency of file query. On the other hand, clustering peers by their physical proximity can also improve file query performance. Few current works are able to cluster peers based on both peer interest and physical proximity. It is even harder to realize it in structured P2Ps due to their strictly defined topologies, although they provide higher file query efficiency than unstructured P2Ps. In this paper, we introduce a proximity-aware and interest-clustered P2P file sharing system (PAIS) based on a structured P2P. It groups peers based on both interest and proximity. PAIS supports sophisticated routing and clustering strategies based on a hierarchical topology. Theoretical analysis and simulation results demonstrate that PAIS dramatically reduces the overhead and enhances efficiency in file sharing.
Citations: 3
The Eucalyptus Open-Source Cloud-Computing System
Pub Date : 2009-05-18 DOI: 10.1109/CCGRID.2009.93
Daniel Nurmi, R. Wolski, Chris Grzegorczyk, Graziano Obertelli, Sunil Soman, Lamia Youseff, D. Zagorodnov
Cloud computing systems fundamentally provide access to large pools of data and computational resources through a variety of interfaces similar in spirit to existing grid and HPC resource management and programming systems. These types of systems offer a new programming target for scalable application developers and have gained popularity over the past few years. However, most cloud computing systems in operation today are proprietary, rely upon infrastructure that is invisible to the research community, or are not explicitly designed to be instrumented and modified by systems researchers. In this work, we present Eucalyptus -- an open-source software framework for cloud computing that implements what is commonly referred to as Infrastructure as a Service (IaaS): systems that give users the ability to run and control entire virtual machine instances deployed across a variety of physical resources. We outline the basic principles of the Eucalyptus design, detail important operational aspects of the system, and discuss architectural trade-offs that we have made in order to allow Eucalyptus to be portable, modular and simple to use on infrastructure commonly found within academic settings. Finally, we provide evidence that Eucalyptus enables users familiar with existing Grid and HPC systems to explore new cloud computing functionality while maintaining access to existing, familiar application development software and Grid middleware.
Citations: 2069
Collusion Detection for Grid Computing
Pub Date : 2009-05-18 DOI: 10.1109/CCGRID.2009.12
Eugen Staab, T. Engel
A common technique for result verification in grid computing is to delegate a computation redundantly to different workers and apply majority voting to the returned results. However, the technique is sensitive to "collusion" where a majority of malicious workers collectively returns the same incorrect result. In this paper, we propose a mechanism that identifies groups of colluding workers. The mechanism is based on the fact that colluders can succeed in a vote only when they hold the majority. This information allows us to build clusters of workers that voted similarly in the past, and so detect collusion. We find that the more strongly workers collude, the better they can be identified.
Citations: 31
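The majority-voting scheme described in the Collusion Detection abstract, and the idea of grouping workers by how similarly they voted in the past, can be illustrated with a small sketch. This is not the authors' mechanism: the function names (`majority_vote`, `agreement_rates`), the layout of the vote history, and the use of a plain pairwise agreement rate are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def majority_vote(results):
    """Return the most common result among redundant workers.

    `results` maps worker id -> returned result (hypothetical layout).
    Ties are resolved in favour of the result seen first.
    """
    return Counter(results.values()).most_common(1)[0][0]


def agreement_rates(history):
    """Fraction of shared tasks on which each pair of workers agreed."""
    same = Counter()
    shared = Counter()
    for results in history:
        for a, b in combinations(sorted(results), 2):
            shared[(a, b)] += 1
            if results[a] == results[b]:
                same[(a, b)] += 1
    return {pair: same[pair] / shared[pair] for pair in shared}


# Hypothetical vote history: one dict per task, worker id -> result.
# Workers w3 and w4 return the same wrong value whenever they meet.
history = [
    {"w1": 4, "w2": 4, "w3": 9},
    {"w1": 7, "w3": 1, "w4": 1},
    {"w2": 5, "w3": 8, "w4": 8},
]

accepted = majority_vote(history[1])  # w3 and w4 outvote w1
rates = agreement_rates(history)
print(accepted, rates[("w3", "w4")], rates[("w1", "w3")])
```

In this toy history the pair that always wins together also shows the highest mutual agreement and no agreement with the worker it outvotes, which is the kind of signal a clustering step could pick up. A real detector would need far more history and a proper clustering method.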
Using Templates to Predict Execution Time of Scientific Workflow Applications in the Grid
Pub Date : 2009-05-18 DOI: 10.1109/CCGRID.2009.77
F. Nadeem, T. Fahringer
Workflow execution time prediction for Grid infrastructures is of critical importance for optimized workflow execution, advance reservation of resources, and overhead analysis. Predicting workflow execution time is complex due to the multiplicity of workflow structures, the involvement of several Grid resources in workflow execution, complex dependencies among workflow activities, and the dynamic behavior of the Grid. In this paper we present an online workflow execution time prediction system exploiting similarity templates. The workflows are characterized by attributes describing their performance at different Grid infrastructure levels. A “supervised exhaustive search” is employed to find suitable templates. We also provide for including expert user knowledge about workflow performance in our methods. Results for three real-world applications are presented to show the effectiveness of our approach.
Citations: 46
Resource Allocation Using Virtual Clusters
Pub Date : 2009-05-18 DOI: 10.1109/CCGRID.2009.23
Mark Stillwell, D. Schanzenbach, F. Vivien, H. Casanova
We propose a novel approach for sharing cluster resources among competing jobs. The key advantage of our approach over current solutions is that it increases cluster utilization while optimizing a user-centric metric that captures notions of both performance and fairness. We motivate and formalize the corresponding resource allocation problem, determine its complexity, and propose several algorithms to solve it in the case of a static workload that consists of sequential jobs. Via extensive simulation experiments we identify an algorithm that runs quickly, that is always on par with or better than its competitors, and that produces resource allocations that are close to optimal. We find that the extension of our approach to parallel jobs leads to similarly good results. Finally, we explain how to extend our work to dynamic workloads.
Citations: 86
Efficient Grid Task-Bundle Allocation Using Bargaining Based Self-Adaptive Auction
Pub Date : 2009-05-18 DOI: 10.1109/CCGRID.2009.86
Han Zhao, Xiaolin Li
To address coordination and complexity issues, we formulate a grid task allocation problem as a bargaining based self-adaptive auction and propose the BarSAA grid task-bundle allocation algorithm. During the auction, prices are iteratively negotiated and dynamically adjusted until market equilibrium is reached. The BarSAA algorithm features decentralized bidding decision making in a heterogeneous distributed environment, so that the scheduler can offload its duty onto participating computing nodes and significantly reduce scheduling overheads. When a BarSAA auction converges, the equilibrium point is Pareto optimal and achieves a socially efficient outcome and double-sided revenue maximization. In addition, BarSAA promotes truthful behavior among selfish nodes. Through game theoretical analysis, we demonstrate that truthful revelation is beneficial to bidders in making bidding strategies. Extensive simulation results are presented to demonstrate the efficiency of the BarSAA strategy and validate several important analytical properties.
Citations: 22
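The BarSAA abstract describes prices being adjusted iteratively until the market settles. The listing gives no details of that algorithm, so the sketch below substitutes a much simpler, generic ascending-price auction for a single task bundle, only to show what iterative price adjustment looks like in code. The function name `ascending_auction`, the node names, the valuations, and the fixed price step are all hypothetical.

```python
def ascending_auction(valuations, start_price=0.0, step=1.0):
    """Raise the price step by step until at most one bidder remains.

    `valuations` maps bidder id -> the highest price that bidder accepts.
    This is a plain ascending-price auction, not the BarSAA algorithm.
    """
    price = start_price
    active = sorted(b for b, v in valuations.items() if v >= price)
    while len(active) > 1:
        still_in = [b for b in active if valuations[b] >= price + step]
        if not still_in:
            break  # every remaining bidder would drop out at the next price
        price += step
        active = still_in
    winner = active[0] if active else None
    return winner, price


# Hypothetical computing nodes bidding for one task bundle.
winner, price = ascending_auction({"n1": 3.0, "n2": 7.5, "n3": 5.0})
print(winner, price)  # n2 6.0
```

Each bidder decides locally whether to stay in at the current price, which loosely mirrors the decentralized bidding decisions the abstract mentions; properties such as Pareto optimality or truthfulness claimed for BarSAA are not demonstrated by this sketch.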
Flexible and Efficient In-Vivo Enhancement for Grid Applications
Pub Date : 2009-05-18 DOI: 10.1109/CCGRID.2009.61
Dong Kwan Kim, Yang Jiao, E. Tilevich
In a grid application, some requirements may change while the execution is in progress. This paper presents in-vivo enhancement--updating running grid applications to facilitate their perfective maintenance. Because applications in this domain are not only typically long-running, but also time-consuming to deploy, we propose a dynamic update technique that can change a running application flexibly and efficiently. Specifically, this paper presents a novel technique for dynamically updating grid applications deployed on the Java Virtual Machine (JVM). Our technique overcomes constraints of JVM HotSwap, a facility for replacing classes at runtime. While HotSwap precludes the programmer from adding new methods and fields or changing the signatures of existing methods, and has no support for transferring state between old and new objects, our approach effectively removes these constraints by rewriting program bytecode. Further, the rewritten programs incur only minimal performance overhead (less than 2% on average). We demonstrate the efficiency and extensibility of our approach through micro and macro benchmarks, as well as through a case study of dynamically updating a parallel bioinformatics application.
Citations: 7
Performance Issues in Parallelizing Data-Intensive Applications on a Multi-core Cluster
Pub Date : 2009-05-18 DOI: 10.1109/CCGRID.2009.83
Vignesh T. Ravi, G. Agrawal
The deluge of data available for analysis makes it necessary to scale the performance of data mining implementations. With the current architectural trends, one of the major challenges today is achieving programmability and performance for data mining applications on multi-core machines and clusters of multi-core machines. To address this problem, we have been developing a runtime framework, FREERIDE, that enables parallel execution of data mining and data analysis tasks. The contributions of this paper are two-fold: 1) This paper describes and evaluates various shared-memory parallelization techniques developed in our run-time system on a cluster of multi-cores, and 2) We report on a detailed performance study to understand why certain parallelization techniques outperform other techniques for a particular application.
Citations: 23
BloomCast: Efficient Full-Text Retrieval over Unstructured P2Ps with Guaranteed Recall
Pub Date : 2009-05-18 DOI: 10.1109/CCGRID.2009.50
Hanhua Chen, Hai Jin, Xucheng Luo, Yunhao Liu, L. Ni
Efficient and effective full-text retrieval in unstructured peer-to-peer networks remains a challenge in the research community. First, it is difficult, if not impossible, for unstructured P2P search protocols to effectively locate items with a guaranteed recall rate. Second, existing schemes to improve the search success rate often rely on replicating a large number of item replicas across the wide area network, incurring large communication and storage costs. In this paper we propose BloomCast, an efficient and effective full-text retrieval scheme for unstructured P2P networks. BloomCast is effective because it guarantees perfect recall rate with high probability. It is efficient because the overall communication cost of full-text search is reduced below a formal bound. Furthermore, by casting Bloom Filters instead of the raw documents across the network, BloomCast significantly reduces the communication cost and storage cost for replication. We demonstrate the power of BloomCast design through both mathematical proof and comprehensive simulations. Results show that BloomCast outperforms existing schemes in terms of both recall rate and communication cost.
Citations: 3
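The BloomCast abstract relies on sending Bloom filters across the network in place of raw documents. The sketch below shows a generic Bloom filter that summarizes the terms of one document. It is a textbook construction rather than BloomCast's actual encoding: the class name, the default sizes, and the use of seeded SHA-256 digests as hash functions are assumptions.

```python
import hashlib


class BloomFilter:
    """Compact set summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Summarize a hypothetical document's terms in 128 bytes.
terms = "efficient full text retrieval over unstructured networks".split()
summary = BloomFilter()
for term in terms:
    summary.add(term)

print(all(term in summary for term in terms))  # True: added terms are always found
print(len(summary.bits))  # 128
```

A peer holding only `summary` can rule out documents that cannot match a query term, at the price of occasional false positives. How BloomCast sizes and replicates its filters is not described in this listing.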
Modeling Job Arrival Process with Long Range Dependence and Burstiness Characteristics
Pub Date : 2009-05-18 DOI: 10.1109/CCGRID.2009.35
T. Minh, L. Wolters
Workload modeling plays a significant role in performance evaluation of large-scale parallel systems such as clusters and grids. It helps to generate synthetic workloads which capture some dominant characteristics of traces (real workloads). Modeling the job arrival process is an essential part of workload modeling. Although a job arrival process has many important characteristics such as long range dependence (LRD) and burstiness, most researchers, for simplicity, assume it to be a Poisson process in their evaluation work. Furthermore, according to our investigation, there is currently almost no research focusing on both LRD and burstiness at the same time. With respect to this research trend, the multifractal wavelet model (MWM) has recently been introduced as a good choice to yield LRD for a job arrival process. Though LRD is well controlled, we observe that a job arrival process produced by MWM does not preserve burstiness. In this paper, we present our study on modifying MWM so that not only LRD but also burstiness is preserved in the job arrival process. In addition, our modification also fits the marginal distribution better than MWM.
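To make the Poisson baseline mentioned in this abstract concrete, the sketch below generates job arrivals from a Poisson process and measures the index of dispersion of per-window counts (variance divided by mean), which is close to 1 for Poisson arrivals and larger for bursty traces. It does not implement MWM or the authors' modification; the rate, horizon, window size, seed, and function names are illustrative assumptions.

```python
import random
from statistics import mean, pvariance


def poisson_arrivals(rate, horizon, rng):
    """Arrival times of a Poisson process with the given rate on [0, horizon)."""
    t, times = 0.0, []
    while True:
        t += rng.expovariate(rate)  # exponential inter-arrival times
        if t >= horizon:
            return times
        times.append(t)


def counts_per_window(times, horizon, window):
    """Number of arrivals falling in each consecutive window of equal length."""
    counts = [0] * int(horizon // window)
    for t in times:
        idx = int(t // window)
        if idx < len(counts):
            counts[idx] += 1
    return counts


rng = random.Random(42)  # fixed seed so the run is repeatable
times = poisson_arrivals(rate=5.0, horizon=2000.0, rng=rng)
counts = counts_per_window(times, horizon=2000.0, window=1.0)
dispersion = pvariance(counts) / mean(counts)
print(round(dispersion, 2))  # close to 1 for a Poisson process
```

Running the same `counts_per_window` and dispersion calculation on a real trace gives a quick first check of how far it departs from the Poisson assumption, although dispersion at a single window size says nothing about long range dependence.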
负载建模在集群和网格等大规模并行系统的性能评估中起着重要的作用。它有助于生成合成的工作负载,这些工作负载捕获轨迹的一些主要特征(实际工作负载)。作业到达过程建模是工作量建模的重要组成部分。虽然工作到达过程具有长距离依赖性和突发性等重要特征,但为了简单起见,大多数研究者在评价工作中都将其假设为泊松过程。此外,根据我们的调查,目前几乎没有同时关注LRD和爆发的研究。针对这一研究趋势,最近引入了多重分形小波模型(MWM)作为作业到达过程产生LRD的良好选择。虽然LRD得到了很好的控制,但我们观察到MWM产生的作业到达过程并不保持突发性。在本文中,我们研究了如何修改MWM,使其在作业到达过程中既保持LRD,又保持突发性。此外,我们的修正也比MWM更适合边际分布。
Citations: 13
Journal
2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid