Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868668
Xingfu Wu, V. Taylor, J. Geisler, X. Li, Z. Lan, R. Stevens, M. Hereld, I. Judson
Efficient execution of applications requires insight into how the system features impact the performance of the application. For distributed systems, the task of gaining this insight is complicated by the complexity of the system features. This insight generally results from significant experimental analysis and possibly the development of performance models. This paper presents the Prophesy project, an infrastructure that aids in gaining this needed insight based upon experience. The core component of Prophesy is a relational database that allows for the recording of performance data, system features and application details.
Title : Prophesy: an infrastructure for analyzing and modeling the performance of parallel and distributed applications
Published in : Proceedings the Ninth International Symposium on High-Performance Distributed Computing
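To make the idea of a relational database that records performance data, system features and application details more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names and sample values are hypothetical and chosen only for illustration; the abstract does not describe Prophesy's actual schema.

```python
import sqlite3

# Hypothetical schema: one table each for application details,
# system features, and the performance data that links them.
SCHEMA = """
CREATE TABLE application (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    version TEXT
);
CREATE TABLE machine (
    id INTEGER PRIMARY KEY,
    hostname TEXT NOT NULL,
    num_processors INTEGER
);
CREATE TABLE performance_run (
    id INTEGER PRIMARY KEY,
    application_id INTEGER NOT NULL REFERENCES application(id),
    machine_id INTEGER NOT NULL REFERENCES machine(id),
    processors_used INTEGER NOT NULL,
    runtime_seconds REAL NOT NULL
);
"""


def demo():
    """Record three made-up runs and query them back, ordered by processor count."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    app_id = conn.execute(
        "INSERT INTO application (name, version) VALUES (?, ?)",
        ("solver", "1.0"),
    ).lastrowid
    machine_id = conn.execute(
        "INSERT INTO machine (hostname, num_processors) VALUES (?, ?)",
        ("cluster-a", 16),
    ).lastrowid
    for procs, seconds in [(4, 120.0), (8, 65.0), (16, 40.0)]:
        conn.execute(
            "INSERT INTO performance_run "
            "(application_id, machine_id, processors_used, runtime_seconds) "
            "VALUES (?, ?, ?, ?)",
            (app_id, machine_id, procs, seconds),
        )
    rows = conn.execute(
        "SELECT processors_used, runtime_seconds FROM performance_run "
        "WHERE application_id = ? ORDER BY processors_used",
        (app_id,),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for procs, seconds in demo():
        print(procs, seconds)
```

Keeping runs in a separate table keyed by application and machine is one way such a store could let past experience be queried when building a performance model; other layouts are equally plausible.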
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868660
K. Hwang, Hai Jin, Roy S. C. Ho
A new RAID-x (redundant array of inexpensive disks at level x) architecture is presented for distributed I/O processing on a serverless cluster of computers. The RAID-x architecture is based on a new concept of orthogonal striping and mirroring (OSM) across all distributed disks in the cluster. The primary advantages of this OSM approach lie in: (1) a significant improvement in parallel I/O bandwidth; (2) hiding disk mirroring overhead in the background; and (3) greatly enhanced scalability and reliability in cluster computing applications. All claimed advantages are substantiated with benchmark performance results on the Trojans cluster built at USC in 1999. The authors discuss the issues of scalable I/O performance, enhanced system reliability, and striped checkpointing on distributed RAID-x in a serverless cluster environment.
Title : RAID-x: a new distributed disk array for I/O-centric cluster computing
Published in : Proceedings the Ninth International Symposium on High-Performance Distributed Computing
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868630
Fangzhe Chang, V. Karamcheti
Increased platform heterogeneity and varying resource availability in distributed systems motivate the design of resource-aware applications, which ensure a desired performance level by continuously adapting their behavior to changing resource characteristics. In this paper, we describe an application-independent adaptation framework that simplifies the design of resource-aware applications. This framework eliminates the need for adaptation decisions to be explicitly programmed into the application by relying on two novel components: (1) a tunability interface, which exposes adaptation choices in the form of alternate application configurations while encapsulating core application functionality, and (2) a virtual execution environment, which emulates application execution under diverse resource availability, enabling off-line collection of information about the resulting behavior. Together, these components permit automatic run-time decisions on when to adapt by continuously monitoring resource conditions and application progress, and how to adapt by dynamically choosing the application configuration that is most appropriate for the prescribed user preference. We evaluate the framework using an interactive distributed image visualization application. The framework permits automatic adaptation to changes in CPU load and network bandwidth by choosing a different compression algorithm or by controlling the image transmission sequence so as to satisfy user preferences of visualization quality and timeliness.
Title : Automatic configuration and run-time adaptation of distributed applications
Published in : Proceedings the Ninth International Symposium on High-Performance Distributed Computing
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868651
Fabio Kon, R. Campbell, M. D. Mickunas, K. Nahrstedt, Francisco J. Ballesteros
The first decades of the new millennium will witness an explosive growth in the number and diversity of networked devices and portals. We foresee high degrees of mobility, heterogeneity, and interactions among computing devices connected to global networks. While previous research in distributed operating systems solved many problems related to resource management, it seldom addressed the problems of heterogeneity and dynamic adaptability. On the other hand, middleware solutions, like CORBA and Java/Jini, solve part of the heterogeneity problem by permitting seamless communication among different platforms. But they do not address dynamic resource management and adaptability for applications requiring high-performance distributed computing. This paper presents 2K, an integrated operating system architecture that addresses the problems of resource management in heterogeneous networks, dynamic adaptability and configuration of component-based distributed applications.
Title : 2K: a distributed operating system for dynamic heterogeneous environments
Published in : Proceedings the Ninth International Symposium on High-Performance Distributed Computing
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868670
S. Adabala, N. Kapadia, J. Fortes
This paper outlines the issues that must be addressed in order to allow cluster management systems such as Condor, DQS (Distributed Queueing Service) and PBS (Portable Batch System) to be transparently used via a wide-area network computing system such as PUNCH (Purdue University Network Computing Hubs).
Title : Interfacing wide-area network computing and cluster management software: Condor, DQS and PBS via PUNCH
Published in : Proceedings the Ninth International Symposium on High-Performance Distributed Computing
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868664
R. D. Burris, M. Gleicher, H. Holmes, D. Million, S. R. White
As computers become more capable, researchers of all types are finding it necessary to store massive quantities of data generated by simulations or experiments and to retrieve them at a high rate for analysis or visualization. Strong needs have arisen for storage systems tuned for particular needs; significant improvements in storage speed and access control; optimized wide area network bulk transfers; utilization of new media and new types of storage devices; and development, testing, and use of user-written storage applications. The Oak Ridge National Laboratory (ORNL) and the National Energy Research Scientific Computing Center (NERSC) have formed a wide-area distributed testbed, entitled "Probe", to support challenging storage-related studies.
Title : Probe - a distributed storage testbed
Published in : Proceedings the Ninth International Symposium on High-Performance Distributed Computing
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868643
C. Kurmann, Michael Müller, F. Rauch, T. Stricker
Cluster platforms offer good computational performance, but they still cannot utilize the potential of Gbit/s communication technology. While the speed of the Ethernet has grown to 1 Gbit/s, the functionality and the architectural support in the network interfaces have remained the same for more than a decade, so that the memory system becomes a limiting factor. To sustain the raw network speed in applications, a "zero-copy" network interface architecture would be required, but, for all widely used stacks, a last copy is required for the (de)fragmentation of the transferred network packets, since Ethernet packets are smaller than a page size. Correctly defragmenting packets of various communication protocols in hardware is an extremely complex task. We therefore consider a speculative defragmentation technique that can eliminate the last defragmenting copy operation in zero-copy TCP/IP stacks on existing hardware. The payload of fragmented packets is separated from the headers and stored in a memory page that can be mapped directly to its final destination in user memory. To evaluate our ideas, we integrated a network interface driver with speculative defragmentation into an existing protocol stack and added well-known page remapping and fast buffer strategies. Measurements indicate that we can improve the performance for a Gigabit Ethernet over a standard Linux 2.2 TCP/IP stack by a factor of 1.5-2 for uninterrupted burst transfers. Furthermore, our study demonstrates good speculation success rates for a database and a scientific application code on a cluster of PCs.
Title : Speculative defragmentation - a technique to improve the communication software efficiency for Gigabit Ethernet
Published in : Proceedings the Ninth International Symposium on High-Performance Distributed Computing
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868661
Hans-Ulrich Heiß, C. Rose, P. Navaux
Current processor allocation techniques for highly parallel systems are based on centralized front-end based algorithms. As a result, the applied strategies are restricted to static allocation, low parallelism and weak fault tolerance. To lift these restrictions, we are investigating a distributed approach to the processor allocation problem in large distributed memory machines. A contiguous and a noncontiguous version of a distributed dynamic processor allocation strategy are proposed and studied. Simulations compare the performance of the proposed strategies with that of well-known centralized algorithms. We also present the results of experiments on a Siemens hpcline Primergy Server with 96 nodes that show distributed allocation is feasible with current technologies.
Title : Distributed processor allocation in large PC clusters
Published in : Proceedings the Ninth International Symposium on High-Performance Distributed Computing
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868633
Jean-Pierre Goux, Sanjeev Kulkarni, Jeff T. Linderoth, Michael Yoder
Describes MW (Master-Worker) - a software framework that allows users to quickly and easily parallelize scientific computations using the master-worker paradigm on the Computational Grid. MW provides both a "top-level" interface to application software and a "bottom-level" interface to existing Grid computing toolkits. Both interfaces are briefly described. We conclude with a case study, where the necessary Grid services are provided by the Condor high-throughput computing system, and the MW-enabled application code is used to solve a combinatorial optimization problem of unprecedented complexity.
Title : An enabling framework for master-worker applications on the Computational Grid
Published in : Proceedings the Ninth International Symposium on High-Performance Distributed Computing
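The master-worker paradigm named in the MW abstract can be illustrated with a small, generic sketch. This is not the MW framework's API: the function names `worker` and `run_master` are hypothetical, and local threads stand in for Grid resources purely so that the example runs on a single machine.

```python
import queue
import threading


def worker(tasks, results, work_fn):
    """Worker loop: pull tasks until a None sentinel arrives, then stop."""
    while True:
        item = tasks.get()
        if item is None:
            break
        task_id, payload = item
        results.put((task_id, work_fn(payload)))


def run_master(payloads, work_fn, num_workers=4):
    """Master: hand out tasks, collect results, return them in task order."""
    payloads = list(payloads)
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results, work_fn))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for task_id, payload in enumerate(payloads):
        tasks.put((task_id, payload))
    for _ in threads:
        tasks.put(None)  # one stop sentinel per worker
    collected = [results.get() for _ in payloads]
    for thread in threads:
        thread.join()
    # Results arrive in completion order; sort by task id to restore input order.
    return [value for _, value in sorted(collected, key=lambda pair: pair[0])]


if __name__ == "__main__":
    print(run_master(range(8), lambda x: x * x))  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The master only ever touches the two queues, so the workers can be added or removed without changing the task logic, which is the property that makes the pattern attractive on resources whose availability varies.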
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868644
S. Sumimoto, H. Tezuka, A. Hori, H. Harada, Toshiyuki Takahashi, Y. Ishikawa
Proposes a scheme to realize a high-performance communication facility using a commodity network. This scheme does not require any special hardware or hardware-specific device drivers in order to adapt to many kinds of network interface cards (NICs). In this scheme, a reliable lightweight network protocol is handled directly on a data link layer called by a network device driver. An interrupt reaping technique is proposed to eliminate the hardware interrupt overhead when an application waits for a message. PM/Ethernet, an instance of the scheme, is implemented on Linux with minimal modification to the Linux kernel, and existing network device drivers are used without any modification. Using Pentium III 500-MHz PCs on Packet Engine's G-NIC II Gigabit Ethernet NIC, it achieves 77.5 MB/s bandwidth and 37.6 μs round-trip time latency compared to that of TCP/IP, which achieves 46.7 MB/s bandwidth and 89.6 μs round-trip time latency. The NAS parallel benchmark IS results show that MPI on PM/Ethernet achieves 75% better performance than MPI on TCP/IP and is 7.8% slower than that of MPI on Myrinet PM.
Title : High performance communication using a commodity network for cluster systems
Published in : Proceedings the Ninth International Symposium on High-Performance Distributed Computing
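The bandwidth and latency figures quoted in the PM/Ethernet abstract can be turned into relative gains with a few lines of arithmetic. The sketch below uses only the four numbers stated in the abstract; the constant and function names are illustrative, and the latency values are the round-trip times in microseconds.

```python
# Figures quoted in the abstract (Gigabit Ethernet, Pentium III 500-MHz PCs).
PM_BANDWIDTH_MB_S = 77.5
TCP_BANDWIDTH_MB_S = 46.7
PM_ROUND_TRIP_US = 37.6
TCP_ROUND_TRIP_US = 89.6


def relative_gains():
    """Return (bandwidth ratio, latency ratio) of PM/Ethernet over TCP/IP."""
    bandwidth_ratio = PM_BANDWIDTH_MB_S / TCP_BANDWIDTH_MB_S
    latency_ratio = TCP_ROUND_TRIP_US / PM_ROUND_TRIP_US
    return bandwidth_ratio, latency_ratio


if __name__ == "__main__":
    bandwidth_ratio, latency_ratio = relative_gains()
    # Prints: bandwidth 1.66x higher, round-trip latency 2.38x lower
    print(f"bandwidth {bandwidth_ratio:.2f}x higher, "
          f"round-trip latency {latency_ratio:.2f}x lower")
```

In other words, the reported numbers correspond to roughly two thirds more bandwidth and well under half the round-trip time of the TCP/IP baseline on the same hardware.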