
Proceedings the Ninth International Symposium on High-Performance Distributed Computing: Latest Publications

Development of Web toolkits for computational science portals: the NPACI HotPage
Mary P. Thomas, S. Mock, J. Boisseau
The NPACI (National Partnership for Advanced Computational Infrastructure) HotPage is a user portal that provides views of a distributed set of high-performance computing (HPC) resources as either an integrated meta-system or as individual machines. These Web pages run on any Web browser, regardless of the system or geographical location, and are supported by secure, encrypted log-in sessions where authenticated users can access their HPC system accounts and perform basic computational tasks. We describe the development of the Grid Portals Toolkit (GridPort), which is based on the architecture developed for the HotPage and provides computational scientists and application developers with a set of simple, modular services and tools that allow application-level, customized science portal development and facilitate seamless Web-based access to distributed computational resources and Grid services.
DOI: 10.1109/HPDC.2000.868671 (published 2000-08-01)
Citations: 66
Application placement using performance surfaces
A. Turgeon, Q. Snell, M. Clement
Heterogeneous parallel clusters of workstations are being used to solve many important computational problems. Scheduling parallel applications on the best collection of machines in a heterogeneous computing environment is a complex problem. Performance prediction is vital to good application performance in this environment since utilization of an ill-suited machine can slow the computation down significantly. The heterogeneity of the different pieces composing the parallel platform (network links, CPU, memory, and OS) makes it incredibly difficult to accurately predict performance. This paper addresses the problem of network performance prediction. Since communication speed is often the bottleneck for parallel application performance, network performance prediction is important to the overall performance prediction problem. A new methodology for characterizing network links and an application's need for network resources is developed which makes use of performance surfaces (Clement et al., 1998). Mathematical operations on the performance surfaces are introduced that calculate an application's affinity for a network configuration. These affinity measures can be used for the scheduling of parallel applications.
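The abstract does not spell out the surface operations, but the core idea of weighting a network's measured performance by an application's communication demand can be sketched as follows. The grid points, values, and names below are invented for illustration, not taken from the paper:

```python
# Illustrative sketch (not the paper's actual formulation): a performance
# surface maps (message size, concurrent load) to predicted bandwidth, and
# an application's "demand surface" weights how often it communicates at
# each grid point. Affinity is the demand-weighted predicted bandwidth.

def affinity(perf_surface, demand_surface):
    """Both surfaces are dicts keyed by (msg_size, load) grid points.
    Demand weights are assumed to sum to 1."""
    return sum(demand_surface[pt] * perf_surface.get(pt, 0.0)
               for pt in demand_surface)

# Two candidate networks characterized at the same grid points (MB/s).
fast_lan = {(1024, 1): 90.0, (1024, 4): 60.0, (65536, 1): 800.0}
slow_wan = {(1024, 1): 5.0,  (1024, 4): 2.0,  (65536, 1): 40.0}

# A latency-bound application that sends mostly small messages.
small_msg_app = {(1024, 1): 0.7, (1024, 4): 0.3}

# A scheduler would place the job on the network with the highest affinity.
best = max([fast_lan, slow_wan], key=lambda s: affinity(s, small_msg_app))
```

The same comparison, repeated per candidate machine set, gives the ranking a scheduler needs.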
DOI: 10.1109/HPDC.2000.868654 (published 2000-08-01)
Citations: 9
Event services for high performance computing
G. Eisenhauer, F. Bustamante, K. Schwan
The Internet and the Grid are changing the face of high-performance computing. Rather than tightly-coupled SPMD-style components running in a single cluster, on a parallel machine, or even on the Internet programmed in MPI, applications are evolving into sets of collaborating components scattered across diverse computational elements. These collaborating components may run on different operating systems and hardware platforms and may be written by different organizations in different languages. Complete "applications" are constructed by assembling these components in a plug-and-play fashion. This new vision for high-performance computing demands features and characteristics which are not easily provided by traditional high-performance communications middleware. In response to these needs, we have developed ECho, a high-performance event-delivery middleware that meets the new demands of the Grid environment. ECho provides efficient binary transmission of event data with unique features that support data-type discovery and enterprise-scale application evolution. We present measurements detailing ECho's performance to show that ECho significantly outperforms other systems intended to provide this functionality, and that it provides throughput and latency comparable to the most efficient middleware infrastructures available.
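The channel-based publish/subscribe model that ECho implements can be sketched in a few lines. This is a local, in-process stand-in (ECho itself delivers typed binary events across the network); the class, channel name, and payload shape are invented for illustration:

```python
# Minimal publish/subscribe event channel in the spirit of ECho's model.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Event:
    source: str
    payload: dict

class EventChannel:
    def __init__(self, name: str):
        self.name = name
        self._subscribers: List[Callable[[Event], None]] = []

    def subscribe(self, handler: Callable[[Event], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: Event) -> None:
        # ECho would marshal the event once and ship it to remote sinks;
        # here every local subscriber is simply invoked in turn.
        for handler in self._subscribers:
            handler(event)

received = []
channel = EventChannel("temperature")
channel.subscribe(lambda e: received.append(e.payload["celsius"]))
channel.publish(Event(source="sensor-1", payload={"celsius": 21.5}))
```

Components never address each other directly; they only share a channel, which is what lets independently written components be assembled plug-and-play.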
DOI: 10.1109/HPDC.2000.868641 (published 2000-08-01)
Citations: 94
The Modeler's Workbench: a system for dynamically distributed simulation and data collection
Daniel Andresen, R. Novotny
A fundamental problem in today's research environment, particularly in the biological and environmental sciences, is the inability of researchers to perform even simple investigations due to the inaccessibility and essential difficulty in acquiring and utilizing the necessary data. Often the data and simulation models are available via the Internet, but through a combination of obscurity, incompatibility and inefficiency, they are essentially unusable. Our system - the Modeler's Workbench - aims to address these fundamental problems. Novel aspects of our system include the use of XML as both a resource description language and a machine-independent data transfer mechanism. We build on top of existing metacomputing tools, while providing an infrastructure based on Java wrappers around existing simulations, to provide compatibility with our infrastructure while retaining legacy code and allowing for machine-dependent optimizations. We have developed a prototype of our system, for which we present the design and experimental data. We also discuss our future plans.
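Using XML as both a resource description language and a machine-independent transfer format might look like the sketch below; the element names, attributes, and URL are invented for illustration, not taken from the actual system:

```python
# Sketch of an XML resource description for a wrapped simulation, as the
# Workbench's approach suggests. All names here are hypothetical.
import xml.etree.ElementTree as ET

description = """
<resource name="watershed-model" kind="simulation">
  <input name="rainfall" format="csv"/>
  <output name="runoff" format="csv"/>
  <host url="http://models.example.edu/watershed"/>
</resource>
"""

root = ET.fromstring(description)
inputs = [e.get("name") for e in root.findall("input")]
outputs = [e.get("name") for e in root.findall("output")]
```

Because the description is self-describing text, any client on any platform can discover what the simulation consumes and produces without sharing binary formats.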
DOI: 10.1109/HPDC.2000.868667 (published 2000-08-01)
Citations: 1
Creating large scale database servers
J. Becla, A. Hanushevsky
The BaBar experiment at the Stanford Linear Accelerator Center (SLAC) is designed to perform a high precision investigation of the decays of B-mesons produced from electron-positron interactions. The experiment, started in May 1999, will generate approximately 300 TB/year of data for 10 years. All of the data will reside in Objectivity databases (object-oriented databases), accessible via the Advanced Multi-threaded Server (AMS). To date, over 70 TB of data have been placed in Objectivity/DB, making it one of the largest databases in the world. Providing access to such a large quantity of data through a database server is a daunting task. A full-scale testbed environment had to be developed to tune various software parameters and a fundamental change had to occur in the AMS architecture to allow it to scale past several hundred terabytes of data. Additionally, several protocol extensions had to be implemented to provide practical access to large quantities of data. The paper describes the design of the database and the changes that we needed to make in the AMS for scalability reasons and how the lessons we learned would be applicable to virtually any kind of database server seeking to operate in the Petabyte region.
DOI: 10.1109/HPDC.2000.868659 (published 2000-08-01)
Citations: 6
Incorporating job migration and network RAM to share cluster memory resources
Li Xiao, Xiaodong Zhang, Stefan A. Kubricht
Job migrations and network RAM are two approaches for effectively using global memory resources in a workstation cluster, aimed at reducing page faults in each local workstation and improving the overall performance of cluster computing. Using either remote executions or pre-emptive migrations, a load-sharing system is able to migrate a job from a workstation without sufficient memory space to a lightly loaded workstation with a large idle memory space for the migrated job. In a network RAM system, if a job cannot find sufficient memory space for its working sets, it utilizes idle memory space from other workstations in the cluster through remote paging. Conducting trace-driven simulations, we have compared the performance and tradeoffs of the two approaches and their impacts on job execution time and cluster scalability. Job migration-based load-sharing schemes are able to balance executions of jobs in a cluster well, while network RAM is able to satisfy data-intensive jobs which may not be migratable by sharing all the idle memory resources in a cluster. A network RAM cluster of workstations is scalable only if the network is sufficiently fast. We propose an improved load-sharing scheme by combining job migrations with network RAM for cluster computing. This scheme uses remote execution to initially allocate a job to the most lightly loaded workstation and, if necessary, network RAM to provide a larger memory space for the job than would be available otherwise. The improved scheme has the merits of both job migrations and network RAM. Our experiments show its effectiveness and scalability for cluster computing.
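The combined policy described above (remote execution to the most lightly loaded workstation, then network RAM to cover any memory shortfall) can be sketched as a simple placement function. The node fields, numbers, and borrowing order below are invented for illustration, not the paper's algorithm:

```python
# Illustrative placement policy combining job migration with network RAM.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Node:
    name: str
    load: float                     # CPU load average
    free_mem: int                   # MB of idle local memory
    borrowed: Dict[str, int] = field(default_factory=dict)

def place_job(nodes: List[Node], job_mem: int) -> Node:
    # Remote execution: pick the most lightly loaded workstation.
    target = min(nodes, key=lambda n: n.load)
    shortfall = job_mem - target.free_mem
    if shortfall > 0:
        # Network RAM: claim idle memory on other nodes via remote paging.
        for donor in sorted(nodes, key=lambda n: -n.free_mem):
            if donor is target or shortfall <= 0:
                continue
            grant = min(donor.free_mem, shortfall)
            donor.free_mem -= grant
            target.borrowed[donor.name] = grant
            shortfall -= grant
        if shortfall > 0:
            raise MemoryError("cluster cannot hold the job's working set")
    return target

nodes = [Node("a", load=0.2, free_mem=256), Node("b", load=1.5, free_mem=1024)]
chosen = place_job(nodes, job_mem=512)  # "a" wins on load, borrows 256 MB from "b"
```

The point of the combination is visible here: node "a" would be rejected by a pure memory-based policy, and node "b" by a pure load-based one; borrowing lets the lightly loaded node run the memory-hungry job anyway.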
DOI: 10.1109/HPDC.2000.868636 (published 2000-08-01)
Citations: 27
A component based services architecture for building distributed applications
R. Bramley, K. Chiu, S. Diwan, Dennis Gannon, M. Govindaraju, N. Mukhi, B. Temko, Madhuri Yechuri
Describes an approach to building a distributed software component system for scientific and engineering applications that is based on representing Computational Grid services as application-level software components. These Grid services provide tools such as registry and directory services, event services and remote component creation. While a service-based architecture for grids and other distributed systems is not new, this framework provides several unique features. First, the public interfaces to each software component are described as XML documents. This allows many adaptors and user interfaces to be generated from the specification dynamically. Second, this system is designed to exploit the resources of existing Grid infrastructures like Globus and Legion, and commercial Internet frameworks like e-speak. Third, and most important, the component-based design extends throughout the system. Hence, tools such as application builders, which allow users to select components, start them on remote resources, and connect and execute them, are also interchangeable software components. Consequently, it is possible to build distributed applications using a graphical "drag-and-drop" interface, a Web-based interface, a scripting language like Python, or an existing tool such as Matlab.
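The first feature above, generating adaptors dynamically from an XML interface description, can be illustrated with a toy proxy generator. The XML schema, component name, and dispatcher below are invented for illustration; a real dispatcher would route the call over the Grid:

```python
# Sketch: build a call proxy from an XML component-interface description.
import xml.etree.ElementTree as ET

interface_xml = """
<component name="Solver">
  <method name="solve" returns="float">
    <param name="tolerance" type="float"/>
  </method>
</component>
"""

def make_proxy(xml_text, dispatch):
    """Return an object whose methods come from the XML description.
    `dispatch(component, method, kwargs)` performs the actual call."""
    root = ET.fromstring(xml_text)
    component = root.get("name")
    proxy = type(component, (), {})()
    for m in root.findall("method"):
        mname = m.get("name")
        def call(_m=mname, **kwargs):
            return dispatch(component, _m, kwargs)
        setattr(proxy, mname, call)
    return proxy

# A stand-in dispatcher that just echoes what it was asked to invoke.
solver = make_proxy(interface_xml, lambda c, m, kw: (c, m, kw))
```

Because the proxy is synthesized from the document rather than compiled against the component, any front end that can read the XML (a builder GUI, a scripting language, a Web form) gets a usable interface for free.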
DOI: 10.1109/HPDC.2000.868634 (published 2000-08-01)
Citations: 111
Failure-atomic file access in an interposed network storage system
Darrell C. Anderson, J. Chase
Presents a recovery protocol for block I/O operations in Slice, a storage system architecture for high-speed LANs incorporating network-attached block storage. The goal of the Slice architecture is to provide a network file service with scalable bandwidth and capacity while preserving compatibility with off-the-shelf clients and file server appliances. The Slice prototype "virtualizes" the Network File System (NFS) protocol by interposing a request switching filter at the client's interface to the network storage system (e.g. in a network adapter or switch). The distributed Slice architecture separates functions that are typically combined in central file servers, introducing new challenges for failure atomicity. This paper presents a protocol for atomic file operations and recovery in the Slice architecture, and related support for reliable file storage using mirrored striping. Experimental results from the Slice prototype show that the protocol has low cost in the common case, allowing the system to deliver client file access bandwidths approaching Gbit/s network speeds.
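The failure-atomicity idea behind mirrored storage, declare a write durable only when every replica holds it, and repair divergent replicas from the freshest copy on recovery, can be sketched as below. This illustrates the general technique only; it is not Slice's protocol, and the versioning scheme is invented:

```python
# Sketch of atomic block writes over mirrored replicas with crash recovery.
class Mirror:
    def __init__(self):
        self.blocks = {}            # block number -> (version, data)

def write_block(mirrors, blockno, version, data):
    # The write succeeds only once every replica has staged the new version.
    for m in mirrors:
        m.blocks[blockno] = (version, data)

def recover(mirrors, blockno):
    """After a crash, keep the replica copy with the highest version and
    copy it back to every mirror so the replicas agree again."""
    copies = [m.blocks[blockno] for m in mirrors if blockno in m.blocks]
    version, data = max(copies)     # version number decides the winner
    for m in mirrors:
        m.blocks[blockno] = (version, data)
    return data

a, b = Mirror(), Mirror()
write_block([a, b], 7, version=1, data=b"old")
b.blocks[7] = (2, b"new")           # simulate a crash mid-update: only b has v2
repaired = recover([a, b], 7)       # the half-finished write is rolled forward
```

A crash between the two replica updates leaves the mirrors disagreeing; recovery resolves the disagreement deterministically, which is what makes the operation appear atomic to clients.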
DOI: 10.1109/HPDC.2000.868646 (published 2000-08-01)
Citations: 11
Robust resource management for metacomputers
J. Gehring, A. Streit
Presents a robust software infrastructure for metacomputing. The system is intended to be used by others as a building block for large and powerful computational grids. Much effort has been taken to develop a fault-tolerant architecture that does not exhibit a single point of failure. Furthermore, we have designed the system to be modular, lean and portable. It is available as open source code and has been successfully compiled on POSIX- and Microsoft Windows-compliant platforms. The system does not originate from a laboratory environment but has proven its robustness within two large metacomputing installations. It embodies a modular concept which allows easy integration of new or modified components. Hence, it is not necessary to buy into the system as a whole. We rather encourage others to use only those components that fit into their specific environments.
Citations: 32
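The no-single-point-of-failure claim in the abstract above can be illustrated with a minimal client-side failover loop over redundant manager replicas. The replica list, class names, and error handling are assumptions made for this sketch, not details taken from the system itself:

```python
# Illustrative sketch (not from the paper) of avoiding a single point of
# failure: clients hold an ordered list of redundant manager replicas and
# fail over to the next one when a submission attempt raises an error.

class Replica:
    def __init__(self, name, alive=True):
        self.name, self.alive = name, alive

    def submit(self, job):
        if not self.alive:
            raise ConnectionError(self.name + " is down")
        return "accepted by " + self.name

def submit_with_failover(replicas, job):
    """Try each manager replica in turn; fail only if all are down."""
    for r in replicas:
        try:
            return r.submit(job)
        except ConnectionError:
            continue                     # this replica is down, try the next
    raise RuntimeError("all manager replicas unreachable")

managers = [Replica("mgr-a", alive=False), Replica("mgr-b")]
print(submit_with_failover(managers, {"cmd": "simulate"}))  # accepted by mgr-b
```

The design point is that recovery is driven by the client: no central coordinator has to notice the failure before work can continue.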
Data mining on NASA's Information Power Grid
T. Hinke, Jason Novotny
The paper describes the development of a data mining system designed to operate on NASA's Information Power Grid (IPG). Mining agents will be staged to one or more processors on the IPG, where they will grow through just-in-time acquisition of new operations. They will mine data supplied through just-in-time delivery. Some initial experimental results are presented.
Citations: 48
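The "just-in-time acquisition of new operations" mentioned above can be sketched as an agent that fetches each mining operation from a repository only on first use, so a freshly staged agent carries no operations at all. The repository contents and all names below are hypothetical:

```python
# Minimal sketch of just-in-time operation acquisition: a staged agent
# starts empty and pulls each mining operation from a (mock) repository
# the first time it is needed. Names are hypothetical illustrations.

OPERATION_REPOSITORY = {
    "cluster": lambda data: sorted(set(data)),          # toy stand-ins for
    "threshold": lambda data: [x for x in data if x > 0],  # real mining ops
}

class MiningAgent:
    def __init__(self):
        self.ops = {}                      # operations acquired so far

    def run(self, op_name, data):
        if op_name not in self.ops:        # just-in-time acquisition
            self.ops[op_name] = OPERATION_REPOSITORY[op_name]
        return self.ops[op_name](data)

agent = MiningAgent()
print(agent.run("threshold", [-2, 3, 0, 7]))   # [3, 7]
print(sorted(agent.ops))                       # only "threshold" was fetched
```

The payoff of this pattern is that a staged agent stays small: it pays the transfer cost only for the operations the mining task actually invokes.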