A synthesis of parallel out-of-core sorting programs on heterogeneous clusters
Pub Date: 2003-05-12, DOI: 10.1109/CCGRID.2003.1199355
C. Cérin, Hazem Fkaier, M. Jemni
The paper considers the problem of parallel external sorting in the context of a class of heterogeneous clusters. We introduce two algorithms and compare them with one that we have previously developed. Since most common sorting algorithms assume high-speed random access to all intermediate memory, they are unsuitable when the values to be sorted do not fit in main memory. This is the case for cluster computing platforms made of standard, cheap, and scarce components. For that class of computing resources, a good use of I/O operations, compatible with the requirements of load balancing and computational complexity, is the key to success. We explore three techniques and show how they can be deployed on clusters whose processor performances are related by a multiplicative factor. We validate the approaches by showing experimental results for the load-balancing factor.
{"title":"A synthesis of parallel out-of-core sorting programs on heterogeneous clusters","authors":"C. Cérin, Hazem Fkaier, M. Jemni","doi":"10.1109/CCGRID.2003.1199355","DOIUrl":"https://doi.org/10.1109/CCGRID.2003.1199355","url":null,"abstract":"The paper considers the problem of parallel external sorting in the context of a form of heterogeneous clusters. We introduce two algorithms and we compare them to another one that we have previously developed. Since most common sort algorithms assume high-speed random access to all intermediate memory, they are unsuitable if the values to be sorted don't fit in main memory. This is the case for cluster computing platforms which are made of standard, cheap and scarce components. For that class of computing resources a good use of I/O operations compatible with the requirements of load balancing and computational complexity are the key to success. We explore three techniques and show how they can be deployed for clusters with processor performances related by a multiplicative factor. We validate the approaches in showing experimental results for the load balancing factor.","PeriodicalId":433323,"journal":{"name":"CCGrid 2003. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003. Proceedings.","volume":"os-34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127778102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Peer-to-peer keyword search using keyword relationship
Pub Date: 2003-05-12, DOI: 10.1109/CCGRID.2003.1199388
K. Nakauchi, Y. Ishikawa, H. Morikawa, T. Aoyama
Decentralized and unstructured peer-to-peer (P2P) networks such as Gnutella are attractive for Internet-scale information retrieval and search systems because they require neither a centralized directory nor centralized management of overlay network topology and data placement. However, due to this decentralized architecture, current P2P keyword search systems lack useful global knowledge such as the popularity of data items and the relationships between keywords and data items. As a result, current P2P keyword search systems support only naive text-match search and can find only data items with a keyword (or metadata) exactly indicated in a query. In this paper, we present an efficient P2P search system that increases the possibility of discovering desired data items. The key mechanism is query expansion, where a received query is expanded based on keyword relationships managed in a distributed fashion by participating nodes. Keyword relationships are improved through search and retrieval processes, and each relationship is shared among nodes holding similar data items. We also present an implementation of our P2P search system.
{"title":"Peer-to-peer keyword search using keyword relationship","authors":"K. Nakauchi, Y. Ishikawa, H. Morikawa, T. Aoyama","doi":"10.1109/CCGRID.2003.1199388","DOIUrl":"https://doi.org/10.1109/CCGRID.2003.1199388","url":null,"abstract":"Decentralized and unstructured peer-to-peer (P2P) networks such as Gnutella are attractive for Internet-scale information retrieval and search systems because they require neither any centralized directory nor any centralized management of overlay network topology and data placement. However, due to this decentralized architecture, current P2P keyword search systems lack useful global knowledge such as popularity of data items and relationships between keywords and data items. As a result, current P2P keyword search systems supports only naive text-match search and can find only data items with a keyword (or meta-data) exactly indicated in a query. In this paper, we show an efficient P2P search system which increases possibility of discovering desired data items. The key mechanism is query expansion, where a received query is expanded based on keyword relationships managed in a distributed fashion by participating nodes. Keyword relationships are improved through search and retrieval processes and each relationships is shared among nodes holding similar data items. We also present implementation of our P2P search system.","PeriodicalId":433323,"journal":{"name":"CCGrid 2003. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003. Proceedings.","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115240984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Clustering hosts in P2P and global computing platforms
Pub Date: 2003-05-12, DOI: 10.1109/CCGRID.2003.1199389
Abhishek Agrawal, H. Casanova
Being able to identify clusters of nearby hosts among Internet clients provides very useful information for a number of Internet and P2P applications. Examples of such applications include web applications, request routing in peer-to-peer overlay networks, and distributed computing applications. In this paper, we present and formulate the Internet host clustering problem. Leveraging previous work on Internet host distance measurement, we propose two hierarchical clustering techniques to solve this problem. The first technique is a marker-based hierarchical partitioning approach. The second technique is based on the well-known K-means clustering algorithm. We evaluated these two approaches in simulation using a representative Internet topology generated with the GT-ITM generator for over 1,000 hosts. Our simulation results demonstrate that our algorithmic clustering approaches effectively identify clusters with arbitrary diameters. Our conclusion is that, by leveraging previous work on Internet host distance estimation, it is possible to cluster Internet hosts to benefit various applications with various requirements.
{"title":"Clustering hosts in P2P and global computing platforms","authors":"Abhishek Agrawal, H. Casanova","doi":"10.1109/CCGRID.2003.1199389","DOIUrl":"https://doi.org/10.1109/CCGRID.2003.1199389","url":null,"abstract":"Being able to identify clusters of nearby hosts among Internet clients provides very useful information for a number of internet and p2p applications. Examples of such applications include web applications, request routing in peer-to-peer overlay network, and distributed computing applications. In this paper, we present and formulate the internet host clustering problem. Leveraging previous work on internet host distance measurement, we propose two hierarchical clustering techniques to solve this problem. The first technique is a marker based hierarchical partitioning approach. The second technique is based on the well known K-means clustering algorithm. We evaluated these two approaches in simulation using a representative Internet topology generated with the GT ITM generator for over 1,000 hosts. Our simulation results demonstrate that our algorithmic clustering approaches effectively identify clusters with arbitrary diameters. Our conclusion is that by leveraging previous work on internet host distance estimation, it is possible to cluster Internet hosts to benefit various applications with various requirements.","PeriodicalId":433323,"journal":{"name":"CCGrid 2003. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003. Proceedings.","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115651425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DM²: a distributed medical data manager for grids
Pub Date: 2003-05-12, DOI: 10.1109/CCGRID.2003.1199421
H. Duque, J. Montagnat, J. Pierson, L. Brunie, I. Magnin
Medical data represent a tremendous amount of data for which automatic analysis is increasingly needed. Grids are very promising for facing today's challenging health issues, such as epidemiological studies over large image data sets. However, the sensitive nature of medical data makes it difficult to widely distribute medical applications over computational grids. In this paper, we review fundamental medical data manipulation requirements and propose a distributed data management architecture that addresses medical data security and high-performance constraints. A prototype is currently being developed in our laboratories to demonstrate the architecture's ability to handle realistic distributed medical data manipulation scenarios.
{"title":"DM/sup 2/: a distributed medical data manager for grids","authors":"H. Duque, J. Montagnat, J. Pierson, L. Brunie, I. Magnin","doi":"10.1109/CCGRID.2003.1199421","DOIUrl":"https://doi.org/10.1109/CCGRID.2003.1199421","url":null,"abstract":"Medical data represent tremendous amount of data for which automatic analysis is increasingly needed. Grids are very promising to face today's challenging health issues such as epidemiological studies through large image data sets. However, the sensitive nature of medical data makes it difficult to widely distribute medical applications over computational grids. In this paper, we review fundamental medical data manipulation requirements and we propose a distributed data management architecture that addresses the medical data security and high performance constraints. A prototype is currently being developed inside our laboratories to demonstrate the architecture capability to face realistic distributed medical data manipulation situations.","PeriodicalId":433323,"journal":{"name":"CCGrid 2003. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003. Proceedings.","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130833630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Performance analysis of parallel I/O scheduling approaches on cluster computing systems
Pub Date: 2003-05-12, DOI: 10.1109/CCGRID.2003.1199439
J. Abawajy
As computation and communication hardware performance continues to increase rapidly, I/O represents a growing fraction of application execution time. This gap between the I/O subsystem and the rest of the system is expected to widen in the future, since I/O performance is limited by physical motion. Therefore, it is imperative that novel techniques for improving I/O performance be developed. Parallel I/O is a promising approach to alleviating this bottleneck. However, very little work exists on explicitly scheduling parallel I/O operations. In this paper, we address the problem of effective management of parallel I/O in cluster computing systems by using appropriate I/O scheduling strategies. We propose two new I/O scheduling algorithms and compare them with two existing scheduling approaches. The preliminary results show that the proposed policies substantially outperform existing policies.
{"title":"Performance analysis of parallel I/O scheduling approaches on cluster computing systems","authors":"J. Abawajy","doi":"10.1109/CCGRID.2003.1199439","DOIUrl":"https://doi.org/10.1109/CCGRID.2003.1199439","url":null,"abstract":"As computation and communication hardware performance continue to rapidly increase, I/O represents a growing fraction of application execution time. This gap between the I/O subsystem and others is expected to increase in future since I/O performance is limited by physical motion. Therefore, it is imperative that novel techniques for improving I/O performance be developed. Parallel I/O is a promising approach to alleviating this bottleneck. However, very little work exist with respect to scheduling parallel I/O operations explicitly. In this paper, we address the problem of effective management of parallel I/O in cluster computing systems by using appropriate I/O scheduling strategies. We propose two new I/O scheduling algorithms and compare them with two existing scheduling Approaches. The preliminary results show that the proposed policies outperform existing policies substantially.","PeriodicalId":433323,"journal":{"name":"CCGrid 2003. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003. Proceedings.","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125297588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A slacker coherence protocol for pull-based monitoring of on-line data sources
Pub Date: 2003-05-12, DOI: 10.1109/CCGRID.2003.1199375
R. Sundaresan, T. Kurç, Mario Lauria, S. Parthasarathy, J. Saltz
An increasing number of online applications operate on data from disparate, and often widespread, data sources. This paper studies the design of a system for the automated monitoring of on-line data sources. In this system, a number of ad hoc data warehouses, which maintain client-specified views, are interposed between clients and data sources. We present a model of coherence, referred to here as slacker coherence, to address the freshness problem in the context of pull-based protocols. We experimentally examine various techniques for estimating update rates and polling adaptively. We also examine the impact of the request scheduling algorithm at the source on the performance of the coherence model.
{"title":"A slacker coherence protocol for pull-based monitoring of on-line data sources","authors":"R. Sundaresan, T. Kurç, Mario Lauria, S. Parthasarathy, J. Saltz","doi":"10.1109/CCGRID.2003.1199375","DOIUrl":"https://doi.org/10.1109/CCGRID.2003.1199375","url":null,"abstract":"An increasing number of online applications operate on data from disparate, and often wide-spread, data sources. This paper studies the design of a system for the automated monitoring of on-line data sources. In this system a number of ad-hoc data warehouses, which maintain client-specified views, are interposed between clients and data sources. We present a model of coherence, referred to here as slacker coherence, to address the freshness problem in the context of pull-based protocols. We experimentally examine various techniques for estimating update rates and polling adaptively. We also look at the impact on the coherence model performance of the request scheduling algorithm at the source.","PeriodicalId":433323,"journal":{"name":"CCGrid 2003. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003. Proceedings.","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125621717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Checkpointing and recovery of shared memory parallel applications in a cluster
Pub Date: 2003-05-12, DOI: 10.1109/CCGRID.2003.1199403
R. Badrinath, C. Morin, Geoffroy R. Vallée
This paper describes issues in the design and implementation of checkpointing and recovery modules for the Kerrighed DSM cluster system. Our design targets a DSM supporting the sequential consistency model. The mechanisms are general enough to be used in a number of different checkpointing and recovery protocols, are designed to support common performance optimizations suggested in the literature, and stay lightweight during fault-free execution. We also present preliminary performance results of the current implementation.
{"title":"Checkpointing and recovery of shared memory parallel applications in a cluster","authors":"R. Badrinath, C. Morin, Geoffroy R. Vallée","doi":"10.1109/CCGRID.2003.1199403","DOIUrl":"https://doi.org/10.1109/CCGRID.2003.1199403","url":null,"abstract":"This paper describes issues in the design and implementation of checkpointing and recovery modules for the Kerrighed DSM cluster system. Our design is for a DSM supporting the sequential consistency model. The mechanisms are general enough to be used in a number of different checkpointing and recovery protocols. It is designed to support common optimizations for performance suggested in literature, while staying light-weight during fault free execution. We also present preliminary performance results of the current implementation.","PeriodicalId":433323,"journal":{"name":"CCGrid 2003. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003. Proceedings.","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122462359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Creating services with hard guarantees from cycle-harvesting systems
Pub Date: 2003-05-12, DOI: 10.1109/CCGRID.2003.1199372
Chris M. Kenyon, G. Cheliotis
Cycle-harvesting is a significant part of the Grid computing landscape. However, creating commercial service contracts based on resources made available by cycle-harvesting is a significant challenge: first, the characteristics of the harvested resources are inherently stochastic; and second, in a commercial environment, purchasers can expect providers to optimize against the quality-of-service (QoS) definitions. The essential point for creating commercially valuable QoS definitions is to guarantee a set of statistical parameters for each contract instance. Here we describe an appropriate QoS definition, Hard Statistical QoS (HSQ), and show how it can be implemented using a hybrid stochastic-deterministic system. We analyze the algorithm's behavior analytically, using a distribution-free approach, in terms of the expected proportion of deterministic resources required to meet an HSQ specification. We conclude that commercial service contracts based on cycle-harvested resources are viable both conceptually and quantitatively.
{"title":"Creating services with hard guarantees from cycle-harvesting systems","authors":"Chris M. Kenyon, G. Cheliotis","doi":"10.1109/CCGRID.2003.1199372","DOIUrl":"https://doi.org/10.1109/CCGRID.2003.1199372","url":null,"abstract":"Cycle-harvesting is a significant part of the Grid computing landscape. However, creating commercial service contracts based on resources made available by cycle-harvesting is a significant challenge: the characteristics of the harvested resources are inherently stochastic; and secondly, in a commercial environment, purchasers can expect providers to optimize against the quality of service (QoS) definitions. The essential point for creating commercially valuable QoS definitions is to guarantee a set of statistical parameters for each contract instance. Here we describe an appropriate QoS definition, Hard Statistical QoS (HSQ), and show how this can be implemented using a hybrid stochastic-deterministic system. We analyze algorithm behavior analytically using a distribution free approach versus the expected proportion of deterministic resources required for an HSQ specification. We conclude that commercial service contracts based on cycle-harvested resources are viable both from a conceptual point of view and quantitatively.","PeriodicalId":433323,"journal":{"name":"CCGrid 2003. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003. Proceedings.","volume":"425 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134064699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An overlay-network approach for distributed access to SRS
Pub Date: 2003-05-12, DOI: 10.1109/CCGRID.2003.1199420
T. Fuhrmann, A. Schafferhans, T. Etzold
SRS is a widely used system for integrating biological databases. Currently, SRS relies only on locally provided copies of these databases. In this paper, we propose a mechanism that also allows the seamless integration of remote databases. To this end, our proposed mechanism splits the existing SRS functionality into two components and adds a third component that enables us to employ peer-to-peer computing techniques to create optimized overlay networks within which database queries can be routed efficiently. As an additional benefit, this mechanism also reduces the administration effort that would be needed with a conventional approach using replicated databases.
{"title":"An overlay-network approach for distributed access to SRS","authors":"T. Fuhrmann, A. Schafferhans, T. Etzold","doi":"10.1109/CCGRID.2003.1199420","DOIUrl":"https://doi.org/10.1109/CCGRID.2003.1199420","url":null,"abstract":"SRS is a widely used system for integrating biological databases. Currently, SRS relies only on locally provided copies of these databases. In this paper we propose a mechanism that also allows the seamless integration of remote databases. To this end, our proposed mechanism splits the existing SRS functionality into two components and adds a third component that enables us to employ peer-to-peer computing techniques to create optimized overlay-networks within which database queries can efficiently be routed. As an additional benefit, this mechanism also reduces the administration effort that would be needed with a conventional approach using replicated databases.","PeriodicalId":433323,"journal":{"name":"CCGrid 2003. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003. Proceedings.","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114012209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cluster infrastructure for biological and health related research
Pub Date: 2003-05-12, DOI: 10.1109/CCGRID.2003.1199416
Sophia Corsava, V. Getov
Researchers in the biological and health industries need powerful and stable systems for their work. These systems must be dependable, fault-tolerant, highly available, and easy to use. To cope with these demands, we propose the use of computational and data clusters in a fail-over configuration, combined with grid technology and job scheduling. Our infrastructure has been deployed successfully for running time-critical applications in commercial environments. We also present experimental results from this pilot implementation that demonstrate the viability of our approach.
{"title":"Cluster infrastructure for biological and health related research","authors":"Sophia Corsava, V. Getov","doi":"10.1109/CCGRID.2003.1199416","DOIUrl":"https://doi.org/10.1109/CCGRID.2003.1199416","url":null,"abstract":"Researchers in the biological and health industries need powerful and stable systems for their work. These systems must be dependable, fault-tolerant, highly available and easy to use. To cope with these demands we propose the use of computational and data clusters in a fail-over configuration combined with the grid technology and job scheduling. Our infrastructure has been deployed successfully for running time-critical applications in commercial environments. We also present experimental results from this pilot implementation that demonstrate the viability of our approach.","PeriodicalId":433323,"journal":{"name":"CCGrid 2003. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003. Proceedings.","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115551329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}