Proceedings. 13th IEEE International Symposium on High performance Distributed Computing, 2004. Latest articles


An approach for automatic data virtualization

Proceedings. 13th IEEE International Symposium on High performance Distributed Computing, 2004.

Pub Date : 2004-06-04 DOI: 10.1109/HPDC.2004.2

L. Weng, G. Agrawal, Ümit V. Çatalyürek, T. Kurç, S. Narayanan, J. Saltz

Analysis of large and/or geographically distributed scientific datasets is emerging as a key component of grid computing. One challenge in this area is that scientific datasets are typically stored as binary or character flat files, which makes specification of processing much harder. In view of this, there has been recent interest in data virtualization, and in data services to support such virtualization. This paper presents an approach for automatically creating data services to support data virtualization. Specifically, we show how a relational-table-like data abstraction can be supported for complex multidimensional scientific datasets that are resident on a cluster. We have designed and implemented a tool that processes SQL queries (with select and where clauses) on multidimensional datasets. We have designed a meta-data description language that is used for specifying the data layout. From such a description, our tool automatically generates efficient data subsetting and access functions. We have extensively evaluated our system. The key observations from our experiments are as follows. First, our tool can correctly and efficiently handle a variety of different data layouts. Second, our system scales well as the number of nodes or the amount of data increases. Third, the performance of the automatically generated code for indexing and contracting functions is quite comparable to the performance of hand-written code.
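As a rough illustration of the idea, and not the authors' tool or their meta-data description language, the sketch below assumes a hypothetical layout descriptor (`LAYOUT`, a list of field names with `struct` format codes) for a binary flat file of fixed-size records. From that descriptor it builds a record reader and answers a select/where-style query. The names `make_reader` and `select` are invented for this example.

```python
import struct

# Hypothetical layout descriptor: (field name, struct format code) per column.
# "i" is a 4-byte int, "f" is a 4-byte float; "<" below means little-endian, no padding.
LAYOUT = [("x", "i"), ("y", "i"), ("temp", "f")]


def make_reader(layout):
    """Build a function that decodes fixed-size binary records into dicts."""
    fmt = "<" + "".join(code for _, code in layout)
    names = [name for name, _ in layout]
    size = struct.calcsize(fmt)

    def read(buf):
        # Walk the buffer one whole record at a time.
        for offset in range(0, len(buf) - size + 1, size):
            yield dict(zip(names, struct.unpack_from(fmt, buf, offset)))

    return read


def select(rows, columns, where):
    """Relational-style view: keep rows matching `where`, project `columns`."""
    return [tuple(row[c] for c in columns) for row in rows if where(row)]


# Build a small in-memory "flat file" with three records.
records = [(0, 0, 1.5), (1, 0, 2.5), (0, 1, 3.5)]
data = b"".join(struct.pack("<iif", *rec) for rec in records)

read = make_reader(LAYOUT)
# Roughly: SELECT x, temp FROM data WHERE temp > 2.0
result = select(read(data), ["x", "temp"], lambda row: row["temp"] > 2.0)
```

The point of the sketch is the separation of concerns: the query is written against column names, while everything about byte offsets and encodings lives in the descriptor.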



Citations: 44

Strategies for using additional resources in parallel hash-based join algorithms

Proceedings. 13th IEEE International Symposium on High performance Distributed Computing, 2004.

Pub Date : 2004-06-04 DOI: 10.1109/HPDC.2004.34

Xi Zhang, T. Kurç, T. Pan, Ümit V. Çatalyürek, S. Narayanan, P. Wyckoff, J. Saltz

Hash-based join is a compute- and memory-intensive algorithm. It achieves good performance and scales well to large datasets if sufficient memory is available to hold the hash table and the distribution of computing load across nodes is balanced. We compare three adaptive algorithms that start with a partitioning of the hash table across a group of nodes and expand to additional resources during the hash table building phase when memory on a node is used up. The split-based algorithm partitions the hash table range assigned to the node whose memory is full into two segments and assigns one of the segments to a new node in the system. The replication-based algorithm replicates the hash table range on a new node. The hybrid algorithm combines the first and second strategies in order to address each strategy's shortcomings. We perform an experimental performance evaluation of these algorithms on a PC cluster. Our results show that among the three algorithms, in most cases the hybrid algorithm either performs close to the better of the two or is the best algorithm.
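The difference between the split-based and replication-based strategies can be sketched with a small single-process simulation. This is a simplified illustration under stated assumptions, not the authors' implementation: the `Node` class, the modulo hash, the tuple-count capacity, and the rule that the newest replica receives inserts are all invented for the example. The hybrid algorithm is not shown because the abstract does not describe how it combines the two.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    lo: int            # inclusive start of the hash range owned by this node
    hi: int            # exclusive end of the hash range
    capacity: int      # assumed memory limit, counted in tuples
    table: dict = field(default_factory=dict)  # hash value -> list of keys

    def size(self):
        return sum(len(bucket) for bucket in self.table.values())

    def full(self):
        return self.size() >= self.capacity


def split(node):
    """Split-based expansion: hand the upper half of the range to a new node."""
    mid = (node.lo + node.hi) // 2
    new = Node(mid, node.hi, node.capacity)
    for h in [h for h in node.table if h >= mid]:
        new.table[h] = node.table.pop(h)
    node.hi = mid
    return new


def replicate(node):
    """Replication-based expansion: a new node covers the same range."""
    return Node(node.lo, node.hi, node.capacity)


def owners(nodes, h):
    """All nodes that may hold tuples with hash value h."""
    return [n for n in nodes if n.lo <= h < n.hi]


def build(keys, n_buckets, capacity, strategy):
    """Build phase; performs one expansion step whenever the target is full."""
    nodes = [Node(0, n_buckets, capacity)]
    for key in keys:
        h = key % n_buckets
        target = owners(nodes, h)[-1]  # newest owner accepts inserts
        if target.full():
            nodes.append(split(target) if strategy == "split" else replicate(target))
            # Under skew a split may leave the target still full; this sketch
            # then simply exceeds the capacity instead of expanding again.
            target = owners(nodes, h)[-1]
        target.table.setdefault(h, []).append(key)
    return nodes


split_nodes = build(range(8), n_buckets=8, capacity=4, strategy="split")
repl_nodes = build(range(8), n_buckets=8, capacity=4, strategy="replicate")
```

In this toy run, splitting leaves each hash value with exactly one owner, so a probe contacts a single node, whereas replication leaves two owners for every hash value, so a probe has to consult both.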



Citations: 13

Discouraging free riding in a peer-to-peer CPU-sharing grid

Proceedings. 13th IEEE International Symposium on High performance Distributed Computing, 2004.

Pub Date : 2004-06-04 DOI: 10.1109/HPDC.2004.9

N. Andrade, F. Brasileiro, W. Cirne, M. Mowbray

Grid computing has excited many with the promise of access to huge amounts of resources distributed across the globe. However, there are no widely adopted solutions for automatically assembling grids, and this limits the scale of today's grids. Some argue that this is due to the overwhelming complexity of the proposed economy-based solutions. Peer-to-peer grids have emerged as a less complex alternative. We are currently deploying OurGrid, one such peer-to-peer grid. OurGrid is a CPU-sharing grid that targets bag-of-tasks applications (i.e., parallel applications whose tasks are independent). In order to ease system deployment, OurGrid is based on a very lightweight autonomous reputation scheme. Free riding is an important issue for any peer-to-peer system. The aim is to show that OurGrid's reputation system successfully discourages free riding, making it in each peer's own interest to collaborate with the peer-to-peer community. We show this in two steps. First, we analyze the conditions under which a reputation scheme can discourage free riding in a CPU-sharing grid. Second, we show that OurGrid's reputation scheme satisfies these conditions, even in the presence of malicious peers. Unlike other distributed mechanisms for discouraging free riding, OurGrid's reputation scheme achieves this without requiring a shared cryptographic infrastructure or specialized storage.
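The abstract does not spell out how the reputation scheme works, so the following is only a hedged sketch of one way an autonomous, purely local scheme could look. It assumes each peer keeps its own ledger of CPU time received from and donated to every other peer, and gives priority to requesters with the highest non-negative balance. The `Peer` class and its method names are hypothetical and are not OurGrid's API.

```python
from collections import defaultdict


class Peer:
    """A peer that ranks others using only its own local observations."""

    def __init__(self, name):
        self.name = name
        self.received = defaultdict(float)  # CPU time others donated to us
        self.given = defaultdict(float)     # CPU time we donated to others

    def record_received(self, other, amount):
        self.received[other] += amount

    def record_given(self, other, amount):
        self.given[other] += amount

    def reputation(self, other):
        # Assumed rule: the balance never drops below zero, so a peer with
        # a poor record is treated the same as a newcomer, not worse.
        return max(0.0, self.received[other] - self.given[other])

    def choose(self, requesters):
        # Serve the requester with the best local reputation;
        # ties go to whoever appears first in the list.
        return max(requesters, key=self.reputation)


p = Peer("p")
p.record_received("donor", 5.0)           # "donor" has helped us before
winner = p.choose(["rider", "donor"])     # "rider" has never contributed
```

Because the ledger is local, nothing in this sketch needs shared storage or signed records, which is the property the abstract highlights. A free rider that never donates keeps a reputation of zero and is served only when no contributor is competing for the same resources.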



Citations: 102

WS-ResourceFramework on .NET

Proceedings. 13th IEEE International Symposium on High performance Distributed Computing, 2004.

Pub Date : 2004-06-04 DOI: 10.1109/HPDC.2004.42

G. Wasson, N. Beekwilder, M. Morgan, M. Humphrey

The WSRF specifications [Foster, I. et al., (2004)] represent the merging of "the Web" and "the grid". This poster describes a design to achieve compliance with the WS-ResourceFramework specifications using Microsoft .NET technologies. Our design seeks to leverage Microsoft tools wherever possible and to make WSRF-compliant services easy to program. While our work on OGSI.NET [Wasson, G. et al., (2004)] provides invaluable insight that guides the design of WSRF.NET, we feel that a different set of abstractions is necessary to capture the full potential of the WS-ResourceFramework. This poster describes our work to date on WSRF.NET. The poster discusses topics such as the implementation of WS-Resources, the WSRF.NET programming model, our security architecture and our future release plans (including our first release at HPDC 13).


Citations: 5

Achieving performance consistency in heterogeneous clusters

Proceedings. 13th IEEE International Symposium on High performance Distributed Computing, 2004.

Pub Date : 2004-06-04 DOI: 10.1109/HPDC.2004.1

Changxun Wu, R. Burns

Hash-based randomization is a powerful technique used in clusters and distributed systems for load management. It offers uniform distribution, efficient addressing, little shared state, and scalability. However, simple hash-based randomization is unable to deal with skew and heterogeneity and, therefore, cannot achieve load balance in many environments. Virtual processors have been proposed as a solution to simple randomization's problem. We evaluate an alternative load management scheme for heterogeneous, shared-disk clusters. Our scheme directly tunes hash-based randomized load placement using a technique called adaptive, nonuniform (ANU) randomization [2003] and compares favorably to the virtual processor approach. It provides the load balancing benefits of virtual processors with less shared state. It also automatically adapts to workload and cluster configuration changes, such as failure and recovery and adding or removing servers, without human involvement. Experimental results show that our scheme outperforms virtual processors and performs comparably to prescient load-balancing algorithms. They also show that our system maintains consistent performance across all servers while moving a minimal amount of load.
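To make the contrast between uniform and nonuniform hash-based placement concrete, here is a minimal sketch. It is not the ANU algorithm from the paper: the weight-per-server model, the SHA-256 based `unit_hash`, and the latency-proportional `retune` rule are assumptions chosen for illustration. Each server owns a slice of the unit interval proportional to its weight, a key is placed by hashing it to a point in that interval, and tuning shrinks the slice of servers that respond more slowly than average.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate


def unit_hash(key):
    """Map a string key to a point in [0, 1)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def place(key, weights):
    """Pick the server whose slice of [0, 1) contains the key's hash point."""
    names = list(weights)
    total = sum(weights.values())
    bounds = list(accumulate(weights[n] / total for n in names))
    index = bisect_right(bounds, unit_hash(key))
    return names[min(index, len(names) - 1)]  # guard against rounding at 1.0


def retune(weights, latency):
    """Assumed tuning rule: scale each weight by mean latency / own latency."""
    mean = sum(latency.values()) / len(latency)
    return {name: w * mean / latency[name] for name, w in weights.items()}


# Two servers start with equal slices; "a" is observed to be twice as slow.
weights = {"a": 1.0, "b": 1.0}
tuned = retune(weights, {"a": 2.0, "b": 1.0})

keys = [f"object-{i}" for i in range(1000)]
load = {name: 0 for name in tuned}
for key in keys:
    load[place(key, tuned)] += 1
```

The only state shared between clients in this sketch is the small weight table, and addressing stays a pure function of the key and that table. One design caveat is that changing a slice boundary moves every key whose hash point falls in the shifted region, so a practical scheme would want to limit how far boundaries move per adjustment.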



Citations: 7

SPURport

Proceedings. 13th IEEE International Symposium on High performance Distributed Computing, 2004.

Pub Date : 1900-01-01 DOI: 10.1109/HPDC.2004.33

T. Haupt, Anand Kalyanasundaram, Nisreen Ammari, Archana Chilukuri, Maxim Khotournenko

The poster presents a successful implementation of SPURport, a prototype Grid portal for the earthquake engineering community. Developed as a part of the SPUR project, it extends the functionality of NEESgrid, which, in turn, is an application of OGSI/Globus 3.0. We found that the implementation of a Grid portal is much easier when one introduces high-level middle-tier services that aggregate and coordinate lower-level services provided by the Globus toolkit. For example, our high-level job submission service orchestrates resolution of logical entities to physical ones, file transfers, and data streaming prior to the actual resource allocation. We found it very useful to employ application descriptors that facilitate automatic generation of RSL documents.


Citations: 0
