Distributed pagerank for P2P systems
Pub Date: 2003-06-22 | DOI: 10.1109/HPDC.2003.1210016
K. Sankaralingam, S. Sethumadhavan, J. Browne
This paper defines and describes a fully distributed implementation of Google's highly effective pagerank algorithm for "peer-to-peer" (P2P) systems. The implementation is based on chaotic (asynchronous) iterative solution of linear systems. The P2P implementation also enables incremental computation of pageranks as new documents are entered into or deleted from the network. Incremental update enables continuously accurate pageranks, whereas the current centralized web crawl and computation over Internet documents requires several days. This suggests possible applicability of the distributed algorithm to pagerank computation as a replacement for the centralized, web-crawler-based implementation for Internet documents. A complete solution of the distributed pagerank computation for an in-place network converges rapidly (1% accuracy in 10 iterations) for large systems, although the time per iteration may be long. The incremental computation resulting from addition of a single document converges extremely rapidly, typically requiring update path lengths of fewer than 15 nodes even for large networks and very accurate solutions. This implementation of pagerank provides a uniform ranking scheme for documents in P2P systems, and its integration with P2P keyword search provides one solution to the network traffic problems engendered by the return of document hits. In basic P2P keyword search, all document hits must be returned to the querying node, causing heavy network traffic. An incremental algorithm for P2P keyword search, in which document hits are sorted by pagerank and returned to the querying node incrementally, is proposed and evaluated. Integration of this algorithm into P2P keyword search can produce dramatic benefit both in effectiveness for users and in reduced network traffic. The incremental search algorithm provided approximately a ten-fold reduction in network traffic for two-word and three-word queries.
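To make the chaotic (asynchronous) iteration concrete, below is a minimal sketch of how a peer might recompute the pageranks of its local documents as contributions arrive from linking documents, with no global synchronization. The peer/message model, the DAMPING constant, and the helper names are illustrative assumptions, not the paper's actual protocol.

```python
# Minimal sketch of asynchronous ("chaotic") pagerank iteration at one peer.
# The peer/message model is an assumption for illustration, not the paper's protocol.

DAMPING = 0.85  # standard pagerank damping factor


class Peer:
    def __init__(self, doc_ids, out_links):
        # out_links maps a local doc_id to the (peer, doc_id) pairs it links to.
        self.out_links = out_links
        self.rank = {d: 1.0 for d in doc_ids}
        # Latest contribution seen from each linking document; a newer message
        # simply overwrites the older one -- there is no global barrier.
        self.contrib = {d: {} for d in doc_ids}
        self.dirty = set(doc_ids)  # documents whose rank should be recomputed

    def receive(self, target_doc, source_key, value):
        self.contrib[target_doc][source_key] = value
        self.dirty.add(target_doc)

    def step(self, tolerance=1e-4):
        # One local pass: recompute dirty documents from whatever has arrived so
        # far and push updated contributions to out-neighbors if the value moved.
        for doc in list(self.dirty):
            self.dirty.discard(doc)
            new_rank = (1.0 - DAMPING) + DAMPING * sum(self.contrib[doc].values())
            if abs(new_rank - self.rank[doc]) <= tolerance:
                continue  # converged locally; stop propagating
            self.rank[doc] = new_rank
            targets = self.out_links.get(doc, [])
            for peer, target_doc in targets:
                peer.receive(target_doc, (id(self), doc), new_rank / len(targets))
```

Each peer simply calls step() in its own loop; incremental update after a document insertion corresponds to marking only the affected documents dirty and letting the change propagate along link paths.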
{"title":"Distributed pagerank for P2P systems","authors":"K. Sankaralingam, S. Sethumadhavan, J. Browne","doi":"10.1109/HPDC.2003.1210016","DOIUrl":"https://doi.org/10.1109/HPDC.2003.1210016","url":null,"abstract":"This paper defines and describes a fully distributed implementation of Google's highly effective pagerank algorithm, for \"peer to peer\" (P2P) systems. The implementation is based on chaotic (asynchronous) iterative solution of linear systems. The P2P implementation also enables incremental computation of pageranks as new documents are entered into or deleted from the network. Incremental update enables continuously accurate pageranks whereas the currently centralized web crawl and computation over Internet documents requires several days. This suggests possible applicability of the distributed algorithm to pagerank computations as a replacement for the centralized Web crawler based implementation for Internet documents. A complete solution of the distributed pagerank computation for an in-place network converges rapidly (1% accuracy in 10 iterations) for large systems although the time for iteration may be long. The incremental computation resulting from addition of a single document converges extremely rapidly, typically requiring update path lengths of fewer than 15 nodes even for large networks and very accurate solutions. This implementation of pagerank provides a uniform ranking scheme for documents in P2P systems, and its integration with P2P keyword search provides one solution to the network traffic problems engendered by return of document hits. In basic P2P keyword search, all the document hits must be returned to the querying node causing large network traffic. An incremental keyword search algorithm for P2P keyword search where document hits are sorted by pagerank, and incrementally returned to the querying node is proposed and evaluated. Integration of this algorithm into P2P keyword search can produce dramatic benefit both in terms of effectiveness for users and decrease in network traffic. The incremental search algorithm provided approximately a ten-fold reduction in network traffic for two-word and three-word querying.","PeriodicalId":430378,"journal":{"name":"High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116224637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PlanetP: using gossiping to build content addressable peer-to-peer information sharing communities
Pub Date: 2003-06-22 | DOI: 10.1109/HPDC.2003.1210033
Francisco Matias Cuenca-Acuna, Christopher Peery, R. Martin, Thu D. Nguyen
We introduce PlanetP, a content-addressable publish/subscribe service for unstructured peer-to-peer (P2P) communities. PlanetP supports content addressing by providing: (1) a gossiping layer used to globally replicate a membership directory and an extremely compact content index; and (2) a completely distributed content search and ranking algorithm that helps users find the most relevant information. PlanetP is a simple yet powerful system for sharing information. PlanetP is simple because each peer must only perform a periodic, randomized, point-to-point message exchange with other peers. PlanetP is powerful because it maintains a globally content-ranked view of the shared data. Using simulation and a prototype implementation, we show that PlanetP achieves ranking accuracy comparable to a centralized solution and scales easily to several thousand peers while remaining resilient to rapid membership changes.
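As a rough illustration of the gossiping layer the abstract describes, the sketch below has each peer summarize its shared terms in a compact Bloom filter and periodically trade its directory with one randomly chosen peer. The filter sizing, hashing, and exchange format are assumptions for illustration, not PlanetP's actual wire protocol.

```python
# Illustrative sketch: gossiping compact per-peer content summaries (Bloom filters).
# Parameters and message format are assumed; a real system would also version
# directory entries so stale summaries can be reconciled.

import hashlib
import random


def bloom_bits(term, num_bits=8192, num_hashes=4):
    """Bit positions a term sets in a fixed-size Bloom filter."""
    digest = hashlib.sha256(term.encode()).digest()
    return {int.from_bytes(digest[4 * i:4 * i + 4], "big") % num_bits
            for i in range(num_hashes)}


class GossipPeer:
    def __init__(self, name, terms):
        self.name = name
        self.summary = set()  # this peer's own Bloom filter, kept as a bit set
        for term in terms:
            self.summary |= bloom_bits(term)
        self.directory = {name: frozenset(self.summary)}  # peer name -> summary

    def gossip_round(self, known_peers):
        # Periodic, randomized, point-to-point exchange with one other peer.
        partner = random.choice(known_peers)
        partner.directory.update(self.directory)
        self.directory.update(partner.directory)

    def might_have(self, peer_name, term):
        # Bloom filters admit false positives but no false negatives, so a query
        # is forwarded only to peers whose summary matches all query terms.
        return bloom_bits(term) <= self.directory.get(peer_name, frozenset())
```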
{"title":"PlanetP: using gossiping to build content addressable peer-to-peer information sharing communities","authors":"Francisco Matias Cuenca-Acuna, Christopher Peery, R. Martin, Thu D. Nguyen","doi":"10.1109/HPDC.2003.1210033","DOIUrl":"https://doi.org/10.1109/HPDC.2003.1210033","url":null,"abstract":"We introduce PlanetP, content addressable publish/subscribe service for unstructured peer-to-peer (P2P) communities. PlanetP supports content addressing by providing: (1) a gossiping layer used to globally replicate a membership directory and an extremely compact content index; and (2) a completely distributed content search and ranking algorithm that help users find the most relevant information. PlanetP is a simple, yet powerful system for sharing information. PlanetP is simple because each peer must only perform a periodic, randomized, point-to-point message exchange with other peers. PlanetP is powerful because it maintains a globally content-ranked view of the shared data. Using simulation and a prototype implementation, we show that PlanetP achieves ranking accuracy that is comparable to a centralized solution and scales easily to several thousand peers while remaining resilient to rapid membership changes.","PeriodicalId":430378,"journal":{"name":"High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133052546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
QoS-aware middleware for cluster-based servers to support interactive and resource-adaptive applications
Pub Date: 2003-06-22 | DOI: 10.1109/HPDC.2003.1210030
S. Senapathi, B. Chandrasekaran, D. Stredney, Han-Wei Shen, D. Panda
Advances in commodity processor and network technologies have made cluster-based servers very attractive for supporting a large number of interactive applications (such as visualization and data mining) in the domains of Grid computing and distributed computing. These applications involve accesses to huge amounts of data within the servers and heavy computation on the accessed data before the results are sent to the clients. The interactive nature of these applications requires some form of QoS support (such as guarantees on response time) from the underlying server. Unfortunately, current-generation cluster-based servers with popular interconnects (Gigabit Ethernet, Myrinet, or Quadrics) do not provide any kind of QoS support. Fortunately, many of these applications are resource-adaptive, i.e., application parameters can be changed to suit user demands and available system resources. To solve these problems, a new QoS-aware middleware layer is proposed in this paper for cluster-based servers with the Myrinet interconnect. The middleware is built on top of a simple NIC-based rate control scheme that provides proportional bandwidth allocation. Three major components of the middleware (profiler, QoS translator, and resource allocator), their functionalities, designs, and associated algorithms are presented. These components work together to execute a requested job in a predictable manner with an efficient allocation of system resources while exploiting the resource-adaptive property of the application. The complete middleware is designed, developed, and implemented on a Myrinet cluster and evaluated for two visualization applications: polygon rendering and ray-tracing. Experimental evaluations demonstrate that the proposed QoS framework enables multiple interactive, resource-adaptive applications to be executed in a predictable manner while keeping the allocation of system resources efficient. The QoS-aware middleware helps applications obtain response times within 7% of the expected times, compared to increases of up to 117% in the absence of any QoS support.
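The proportional bandwidth allocation underlying such NIC-based rate control can be illustrated with a simple water-filling computation. The sketch below is only an approximation under assumed weights and capacities; the actual mechanism in the paper runs in the Myrinet NIC, not in host software.

```python
# Illustrative sketch of proportional bandwidth allocation among competing flows.
# Weights, capacities, and flow names are made up for the example.

def proportional_allocation(link_capacity_mbps, demands):
    """Split link capacity among flows in proportion to their QoS weights,
    never giving a flow more than it requested.

    demands: flow_id -> (weight, requested_mbps)
    """
    remaining = link_capacity_mbps
    allocation = {}
    active = dict(demands)
    # Water-filling: satisfied flows drop out and free their unused share.
    while active and remaining > 1e-9:
        total_weight = sum(w for w, _ in active.values())
        satisfied = []
        for flow, (weight, requested) in active.items():
            share = remaining * weight / total_weight
            if requested <= share:
                allocation[flow] = requested
                satisfied.append(flow)
        if not satisfied:
            # No flow can be fully satisfied: split what is left proportionally.
            for flow, (weight, _) in active.items():
                allocation[flow] = remaining * weight / total_weight
            break
        for flow in satisfied:
            remaining -= allocation[flow]
            del active[flow]
    return allocation


# Example: a 2000 Mbps link shared by a rendering job (weight 3, wants 1800 Mbps)
# and a ray-tracing job (weight 1, wants only 300 Mbps).
print(proportional_allocation(2000, {"render": (3, 1800), "raytrace": (1, 300)}))
# -> {'raytrace': 300, 'render': 1700}
```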
{"title":"QoS-aware middleware for cluster-based servers to support interactive and resource-adaptive applications","authors":"S. Senapathi, B. Chandrasekaran, D. Stredney, Han-Wei Shen, D. Panda","doi":"10.1109/HPDC.2003.1210030","DOIUrl":"https://doi.org/10.1109/HPDC.2003.1210030","url":null,"abstract":"Advances in commodity processor and network technologies have made cluster-based servers very attractive for supporting a large number of interactive applications (such as visualization and data mining) in the domains of Grid computing and distributed computing. These applications involve accesses to huge amounts of data within the servers and heavy computations on the accessed data before sending out the results to the clients. The interactive nature of these applications requires some kind of QoS support (such as guarantees on response time) from the underlying server. Unfortunately, the current generation cluster-based servers with the popular interconnect (Gigabit Ethernet, Myrinet, or Quadrics) do not provide any kinds of QoS support. Fortunately, many of these applications are resource-adaptive, i.e., application parameters can be changed to suit user demands and available system resources. To solve these problems, a new QoS-aware middleware layer is proposed in this paper for cluster-based servers with Myrinet interconnect. The middleware is built on top of a simple NIC-based rate control scheme that provides proportional bandwidth allocation. Three major components of the middleware (profiler, QoS translator, and resource allocator), their functionalities, designs, and the associated algorithms are presented. These components work together to execute a requested job in a predictable manner with an efficient allocation of system resources while exploiting the resource-adaptive property of the application. The complete middleware is designed, developed, and implemented on a Myrinet cluster. It is evaluated for two visualization applications: polygon rendering and ray-tracing. Experimental evaluations demonstrate that the proposed QoS framework enables multiple interactive and resource-adaptive applications to be executed in a predictable manner while keeping the allocation of system resources efficient. It is shown that the QoS-aware middleware helps applications to obtain response times within 7% of the expected times, compared to increases of up to 117% in the absence of any QoS support.","PeriodicalId":430378,"journal":{"name":"High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127067045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pipeline and batch sharing in grid workloads
Pub Date: 2003-06-22 | DOI: 10.1109/HPDC.2003.1210025
D. Thain, John Bent, A. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, M. Livny
We present a study of six batch-pipeline scientific workloads that are candidates for execution on computational grids. Whereas other studies focus on the behavior of single applications, this study characterizes workloads composed of pipelines of sequential processes that use file storage for communication. We present measurements of the memory, CPU, and I/O requirements of individual components, as well as analyses of I/O sharing within complete batches. We conclude with a discussion of the ramifications of these workloads for end-to-end scalability and overall system design.
{"title":"Pipeline and batch sharing in grid workloads","authors":"D. Thain, John Bent, A. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, M. Livny","doi":"10.1109/HPDC.2003.1210025","DOIUrl":"https://doi.org/10.1109/HPDC.2003.1210025","url":null,"abstract":"We present a study of six batch-pipeline scientific workloads that are candidates for execution on computational grids. Whereas other studies focus on the behavior of single applications, this study characterizes workloads composed of pipelines of sequential processes that use file storage for communication and also share measurements of the memory, CPU, and I/O requirements of individual components as well as analyses of I/O sharing within complete batches. We conclude with a discussion of the ramifications of these workloads for end-to-end scalability and overall system design.","PeriodicalId":430378,"journal":{"name":"High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128854623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A performance study of monitoring and information services for distributed systems
Pub Date: 2003-04-10 | DOI: 10.1109/HPDC.2003.1210036
Xuehai Zhang, Jeffrey L. Freschl, J. Schopf
Monitoring and information services form a key component of a distributed system, or Grid. A quantitative study of such services can aid in understanding performance limitations, guide the deployment of a monitoring system, and help evaluate future development work. To this end, we study the performance of three monitoring and information services for distributed systems: the Globus Toolkit® Monitoring and Discovery Service (MDS2), the European Data Grid Relational Grid Monitoring Architecture (R-GMA), and Hawkeye, part of the Condor project. We perform experiments to test their scalability with respect to the number of users, the number of resources, and the amount of data collected. Our study shows that each approach behaves differently, often due to differing design goals. Across the four sets of experiments we conducted to evaluate the performance of the service components under different circumstances, we found a strong advantage to caching or prefetching data, as well as the need to place primary components at well-connected sites because of the high load seen by all systems.
{"title":"A performance study of monitoring and information services for distributed systems","authors":"Xuehai Zhang, Jeffrey L. Freschl, J. Schopf","doi":"10.1109/HPDC.2003.1210036","DOIUrl":"https://doi.org/10.1109/HPDC.2003.1210036","url":null,"abstract":"Monitoring and information services form a key component of a distributed system, or Grid. A quantitative study of such services can aid in understanding the performance limitations, advise in the deployment of the monitoring system, and help evaluate future development work. To this end, we study the performance of three monitoring and information services for distributed systems: the Globus Toolkit/spl reg/ Monitoring and Discovery Service (MDS2), the European Data Grid Relational Grid Monitoring Architecture (R-GMA) and Hawkeye, part of the Condor project. We perform experiments to test their scalability with respect to number of users, number of resources and amount of data collected. Our study shows that each approach has different behaviors, often due to their different design goals. In the four sets of experiments we conducted to evaluate the performance of the service components under different circumstances, we found a strong advantage to caching or pre-fetching the data, as well as the need to have primary components at well-connected sites because of the high load seen by all systems.","PeriodicalId":430378,"journal":{"name":"High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125709890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Proceedings 12th IEEE International Symposium on High Performance Distributed Computing
DOI: 10.1109/HPDC.2003.1210010
The following topics are dealt with: fast communication and data grids; security and novel applications; resource management; application scheduling; fault tolerance; workload characterization; interactive Grids and quality of service (QoS); resource discovery; and resource monitoring.
{"title":"Proceedings 12th IEEE International Symposium on High Performance Distributed Computing","authors":"","doi":"10.1109/HPDC.2003.1210010","DOIUrl":"https://doi.org/10.1109/HPDC.2003.1210010","url":null,"abstract":"The following topics are dealt with: fast communication and data grids; security and novel applications; resource management; application scheduling; fault tolerance; workload characterization; interactive Grids and quality of service (QoS); resource discovery; and resource monitoring.","PeriodicalId":430378,"journal":{"name":"High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122905732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}