Design and implementation of secured information services for the ASCI grid
Wilbur R. Johnson
Pub Date: 2002-07-24 | DOI: 10.1109/HPDC.2002.1029920
Information services are an integral part of the grid architecture. They are the foundation of how resources are defined and how their state is made known. More importantly, it is from information services that a user of the Grid gets a perspective of what the grid looks like, how it performs, and what capabilities it has. The Accelerated Strategic Computing Initiative (ASCI) has designed and deployed a set of grid services within the context of the ASCI program. We deploy information services by augmenting the Globus toolkit to meet the unique aspects of the ASCI grid. We describe the decisions made and the processes developed to run a grid information service in the ASCI grid.
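The abstract gives no implementation details; as a rough illustration of the kind of lookup a grid information service answers, the Python sketch below queries an LDAP-based directory in the style of Globus MDS-2. The endpoint, base DN, object class, and attribute names are placeholders, not taken from the paper, and the anonymous bind omits the secured deployment the paper is about.

```python
# Hypothetical query against an LDAP-based grid information service (MDS-2 style).
# Endpoint, base DN, object class, and attribute names are illustrative only.
from ldap3 import ALL, Connection, Server

server = Server("giis.asci.example.gov", port=2135, get_info=ALL)  # placeholder host/port
conn = Connection(server, auto_bind=True)                          # anonymous bind for the sketch

conn.search(
    search_base="Mds-Vo-name=local, o=Grid",            # placeholder base DN
    search_filter="(objectclass=MdsHost)",              # placeholder object class
    attributes=["Mds-Host-hn", "Mds-Cpu-Total-count"],  # placeholder attribute names
)
for entry in conn.entries:
    print(entry)
```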
{"title":"Design and implementation of secured information services for the ASCI grid","authors":"Wilbur R. Johnson","doi":"10.1109/HPDC.2002.1029920","DOIUrl":"https://doi.org/10.1109/HPDC.2002.1029920","url":null,"abstract":"Information services are an integral part of the grid architecture. It is the foundation of how resources are defined and their state known. More importantly, the user of the Grid gets a perspective of what a grid looks like, how it performs and what capabilities it has from information services. The Accelerated Strategic Computing Initiative (ASCI) has designed and deployed a set of grid services within the context of the ASCI program. We deploy information services by augmenting the Globus toolkit in order to meet the unique aspects of the ASCI grid. We describe the decisions made and processes developed to run a grid information service in the ASCI grid.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116274016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predicting sporadic grid data transfers
Sudharshan S. Vazhkudai, J. Schopf
Pub Date: 2002-07-24 | DOI: 10.1109/HPDC.2002.1029918
The increasingly common practice of replicating datasets and using resources as distributed data stores in grid environments has led to the problem of determining which replica can be accessed most efficiently. Due to the diverse performance characteristics and load variations of the several components in the end-to-end path linking these locations, selecting a replica from among many requires accurate predictions of the data transfer times between sources and sinks. In this paper we present a prediction system that combines end-to-end application throughput observations with network load variations, capturing whole-system performance and variations in load patterns, respectively. We develop a set of regression models to derive predictions that characterize the effect of network load variations on file transfer times. We apply these techniques to the GridFTP data movement tool, part of the Globus Toolkit™, and observe gains of up to 10% in prediction accuracy compared with approaches based on past system behavior in isolation.
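As a minimal sketch of the regression idea (not the authors' models or data), the following fits a linear relation between a network load observation and achieved GridFTP throughput, then uses it to predict a transfer time; the numbers and the single-predictor form are assumptions.

```python
# Minimal regression sketch: relate a network load probe to achieved end-to-end
# throughput, then predict a transfer time. Data and model form are invented.
import numpy as np

load_mbps = np.array([10.0, 25.0, 40.0, 55.0, 70.0, 85.0])        # hypothetical load probes
throughput_mbps = np.array([8.1, 19.5, 31.0, 40.2, 49.8, 60.5])   # hypothetical GridFTP runs

a, b = np.polyfit(load_mbps, throughput_mbps, deg=1)               # least-squares linear fit

def predict_transfer_seconds(file_size_mbit, current_load_mbps):
    """Predicted seconds to move a file given the current load observation."""
    predicted = a * current_load_mbps + b
    return file_size_mbit / max(predicted, 1e-6)

print(predict_transfer_seconds(file_size_mbit=8_000, current_load_mbps=50.0))
```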
{"title":"Predicting sporadic grid data transfers","authors":"Sudharshan S. Vazhkudai, J. Schopf","doi":"10.1109/HPDC.2002.1029918","DOIUrl":"https://doi.org/10.1109/HPDC.2002.1029918","url":null,"abstract":"The increasingly common practice of replicating datasets and using resources as distributed data stores in grid environments has led to the problem of determining which replica can be accessed most efficiently. Due diverse performance characteristics and load variations of several components in the end-to-end path linking these various locations, selecting a replica from among many requires accurate prediction information of the data transfer times between the sources and sinks. In this paper we present a prediction system that is based on combining end-to-end application throughput observations and network load variations, capturing the whole-system performance and variations in load patterns, respectively. We develop a set of regression models to derive predictions that characterize the effect of network load variations on file transfer times. We apply these techniques to the GridFTP data movement tool, part of the Globus Toolkit/spl trade/, and observe performance gains of up to 10% in prediction accuracy when compared with approaches based on past system behavior in isolation.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126508168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lightweight self-organizing frameworks for metacomputing
V. Sunderam, Dawid Kurzyniec
Pub Date: 2002-07-24 | DOI: 10.1109/HPDC.2002.1029909
A novel component-based, service-oriented framework for distributed metacomputing is described. Adopting a provider-centric view of resource sharing, this project emphasizes lightweight software infrastructures that maintain minimal state and interface to current and emerging distributed computing standards. Resource owners host a software backplane onto which owners, clients, or third-party resellers may load components or component suites that deliver value-added services without compromising owner security or control. Standards-based descriptions of services facilitate publication and discovery via established schemes. The architecture of the container framework, the design of components, security and access control schemes, and preliminary experiences are described in this paper.
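The abstract stays at the architecture level; the toy sketch below only illustrates the backplane notion of owner-controlled component loading. The class, the loader policy, and all names are invented for illustration and are not the project's API.

```python
# Toy backplane: a host-resident container that accepts service components
# only from loaders the resource owner has authorized. Names are invented.
class Backplane:
    def __init__(self, allowed_loaders):
        self.allowed_loaders = set(allowed_loaders)
        self.services = {}

    def load(self, loader, name, component):
        """Accept a component only from an authorized loader."""
        if loader not in self.allowed_loaders:
            raise PermissionError(f"{loader} may not load components here")
        self.services[name] = component

    def invoke(self, name, *args):
        return self.services[name](*args)

bp = Backplane(allowed_loaders={"owner", "reseller-42"})
bp.load("reseller-42", "echo", lambda msg: f"echo: {msg}")
print(bp.invoke("echo", "hello"))
```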
{"title":"Lightweight self-organizing frameworks for metacomputing","authors":"V. Sunderam, Dawid Kurzyniec","doi":"10.1109/HPDC.2002.1029909","DOIUrl":"https://doi.org/10.1109/HPDC.2002.1029909","url":null,"abstract":"A novel component-based, service-oriented framework for distributed metacomputing is described. Adopting a provider-centric view of resource sharing, this project emphasizes lightweight software infrastructures that maintain a minimal state, and interface to current and emerging distributed computing standards. Resource owners host a software backplane onto which owners, clients, or third-party, resellers may load components or component-suites that deliver value added services without compromising owner security or control. Standards-based descriptions of services facilitate publication and discovery via established schemes. The architecture of the container framework, design of components, security and access control schemes, and preliminary experiences are described in this paper.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"os-51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127843282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Distributed job scheduling on computational Grids using multiple simultaneous requests
Vijay Subramani, R. Kettimuthu, Srividya Srinivasan, P. Sadayappan
Pub Date: 2002-07-24 | DOI: 10.1109/HPDC.2002.1029936
Even though middleware support for grid computing has been the subject of extensive research, scheduling policies for the grid context have not been much studied. In addition to processor utilization, it is important to consider the response times of jobs in evaluating the performance of grid scheduling strategies. In this paper we propose distributed scheduling algorithms that use multiple simultaneous requests at different sites. Trace-based simulations show that the use of multiple simultaneous requests provides significant performance benefits. We also show how this scheme can be adapted to provide priority to local jobs, without much loss of performance.
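A toy rendering of the multiple-simultaneous-requests idea follows; the site model, the cancellation rule, and all numbers are assumptions for illustration, whereas the paper evaluates the scheme through trace-based simulation.

```python
# Toy model of multiple simultaneous requests: place a job at the k sites with
# the shortest predicted wait, keep the most promising copy, cancel the rest.
class Site:
    def __init__(self, name, backlog_s):
        self.name = name
        self.backlog_s = backlog_s      # seconds of queued work already present
        self.pending = []               # runtimes of our redundant requests

    def predicted_wait(self):
        return self.backlog_s + sum(self.pending)

    def submit(self, runtime_s):
        self.pending.append(runtime_s)

    def cancel(self, runtime_s):
        self.pending.remove(runtime_s)

def schedule_with_redundant_requests(runtime_s, sites, k=2):
    ranked = sorted(sites, key=Site.predicted_wait)[:k]
    for site in ranked:
        site.submit(runtime_s)                          # k simultaneous requests
    winner = min(ranked, key=Site.predicted_wait)       # copy expected to start first
    for site in ranked:
        if site is not winner:
            site.cancel(runtime_s)                      # withdraw redundant requests
    return winner.name

sites = [Site("A", 120), Site("B", 30), Site("C", 300), Site("D", 60)]
print(schedule_with_redundant_requests(runtime_s=600, sites=sites, k=2))
```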
{"title":"Distributed job scheduling on computational Grids using multiple simultaneous requests","authors":"Vijay Subramani, R. Kettimuthu, Srividya Srinivasan, P. Sadayappan","doi":"10.1109/HPDC.2002.1029936","DOIUrl":"https://doi.org/10.1109/HPDC.2002.1029936","url":null,"abstract":"Even though middleware support for grid computing has been the subject of extensive research, scheduling policies for the grid context have not been much studied. In addition to processor utilization, it is important to consider the response times of jobs in evaluating the performance of grid scheduling strategies. In this paper we propose distributed scheduling algorithms that use multiple simultaneous requests at different sites. Trace-based simulations show that the use of multiple simultaneous requests provides significant performance benefits. We also show how this scheme can be adapted to provide priority to local jobs, without much loss of performance.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131220824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Error scope on a computational grid: theory and practice
D. Thain, M. Livny
Pub Date: 2002-07-24 | DOI: 10.1109/HPDC.2002.1029919
Error propagation is a central problem in grid computing. We re-learned this while adding a Java feature to the Condor computational grid. Our initial experience with the system was negative, due to the large number of new ways in which the system could fail. To reason about this problem, we developed a theory of error propagation. Central to our theory is the concept of an error's scope, defined as the portion of a system that it invalidates. With this theory in hand, we recognized that the expanded system did not properly consider the scope of errors it discovered. We modified the system according to our theory, and succeeded in making it a more robust platform for distributed computing.
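The snippet below is only a schematic of the error-scope idea: react to a failure according to the portion of the system it invalidates. The scope names and reactions are one reading of the abstract, not Condor's implementation.

```python
# Schematic error-scope classification. Scope names and reactions are assumed
# for illustration; they are not the paper's or Condor's actual taxonomy.
from enum import Enum

class Scope(Enum):
    JOB = "job"            # the job itself is invalid (e.g., a broken class file)
    MACHINE = "machine"    # only this execution machine is invalid (e.g., no JVM)
    SYSTEM = "system"      # the wider run is invalid (e.g., expired credentials)

def react(error_scope, job, machine):
    if error_scope is Scope.JOB:
        return f"abort {job}: rerunning it elsewhere cannot help"
    if error_scope is Scope.MACHINE:
        return f"mark {machine} unusable for {job} and reschedule elsewhere"
    return f"hold {job} until the wider problem is repaired"

print(react(Scope.MACHINE, job="sim.jar", machine="node17"))
```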
{"title":"Error scope on a computational grid: theory and practice","authors":"D. Thain, M. Livny","doi":"10.1109/HPDC.2002.1029919","DOIUrl":"https://doi.org/10.1109/HPDC.2002.1029919","url":null,"abstract":"Error propagation is a central problem in grid computing. We re-learned this while adding a Java feature to the Condor computational grid. Our initial experience with the system was negative, due to the large number of new ways in which the system could fail. To reason about this problem, we developed a theory of error propagation. Central to our theory is the concept of an error's scope, defined as the portion of a system that it invalidates. With this theory in hand, we recognized that the expanded system did not properly consider the scope of errors it discovered. We modified the system according to our theory, and succeeded in making it a more robust platform for distributed computing.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128728048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Distributed model coupling framework
M. Bettencourt
Pub Date: 2002-07-24 | DOI: 10.1109/HPDC.2002.1029928
An implementation of a distributed model coupling framework is documented. This framework provides the infrastructure for a data-flow approach to the problem of distributed numerical models sharing coupling information. A centralized server stores coupling information such as surface fluxes. This information is then passed to client applications (numerical models) through a series of filters. These filters transform the information into a form ready for use by the model and are specific to the coupling process being performed. CORBA is used for all communication between processes. Results are given for two test cases: a strong tropical event and a rip current calculation over a barred beach. Results for the rip current example are compared to the traditional file-based coupling approach, showing a 50% decrease in execution time with equivalent results.
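As a rough sketch of the filter chain (CORBA transport elided, filters invented), a coupling field fetched from the central server passes through model-specific transformations before the client model uses it:

```python
# Filter-chain sketch: coupling data is transformed by a sequence of
# model-specific filters. The filters and data here are invented examples.
def to_si_units(field):
    return [v * 1.0e-3 for v in field]          # e.g., mW/m^2 -> W/m^2

def smooth(field):
    out = list(field)
    for i in range(1, len(field) - 1):          # simple 3-point average
        out[i] = (field[i - 1] + field[i] + field[i + 1]) / 3.0
    return out

def apply_filters(field, filters):
    for f in filters:                            # compose filters left to right
        field = f(field)
    return field

surface_flux = [100.0, 120.0, 400.0, 110.0, 105.0]   # toy field from the "server"
ready_for_model = apply_filters(surface_flux, [to_si_units, smooth])
print(ready_for_model)
```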
{"title":"Distributed model coupling framework","authors":"M. Bettencourt","doi":"10.1109/HPDC.2002.1029928","DOIUrl":"https://doi.org/10.1109/HPDC.2002.1029928","url":null,"abstract":"An implementation of a distributed model coupling framework is documented. This framework provides the infrastructure for a data-flow approach for solving the problem of distributed numerical models sharing coupling information. There exists a centralized server which stores coupling information such as surface fluxes. This information is then passed to client applications (numerical models) through a series of filters. These filters are used to transform the information into a ready-to-use form by the model and are specific to the coupling process being performed. CORBA is used for all the communication between processes. Results are given for two test cases, a strong tropical event and a rip current calculation over a barred beach. Results for the rip current example are compared to the traditional approach of file based coupling approach showing a 50% decrease in execution time with equivalent results.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124190983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A scalable QoS-aware service aggregation model for peer-to-peer computing grids
Xiaohui Gu, K. Nahrstedt
Pub Date: 2002-07-24 | DOI: 10.1109/HPDC.2002.1029905
Peer-to-peer (P2P) computing grids consist of peer nodes that communicate directly among themselves through wide-area networks and can act as both clients and servers. These systems have drawn much research attention since they promote Internet-scale resource and service sharing without any administration cost or centralized infrastructure support. However, aggregating different application services into high-performance distributed application delivery in such systems is challenging due to dynamic performance information, arbitrary peer arrivals and departures, and the system's scalability requirements. In this paper we propose a scalable QoS-aware service aggregation model to address these challenges. The model includes two tiers: (1) an on-demand service composition tier, which is responsible for choosing and composing different application services into a service path satisfying the user's quality requirements; and (2) a dynamic peer selection tier, which decides the specific peers where the chosen services are actually instantiated, based on dynamic, composite and distributed performance information. The model is designed and implemented in a fully distributed and self-organizing fashion. Conducting extensive simulations of a large-scale P2P system (10^4 peers), we show that our proposed model and algorithms achieve better performance than several common heuristic algorithms.
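A toy sequential rendering of the two tiers follows; the service names, peers, and latency figures are invented, and the real model runs in a fully distributed, self-organizing fashion rather than as this centralized loop.

```python
# Two-tier sketch: (1) a composed service path, (2) greedy binding of each
# service to the peer advertising the best predicted latency. All data invented.
service_path = ["transcode", "compress", "deliver"]   # output of the composition tier

# peer -> {service: predicted latency in ms}, standing in for gossiped performance info
peers = {
    "p1": {"transcode": 40, "compress": 15},
    "p2": {"transcode": 25, "deliver": 30},
    "p3": {"compress": 10, "deliver": 50},
}

def select_peers(path, peers):
    """For each service in the path, pick the peer offering it at lowest latency."""
    binding = {}
    for svc in path:
        candidates = [(info[svc], p) for p, info in peers.items() if svc in info]
        if not candidates:
            raise RuntimeError(f"no peer offers {svc}")
        binding[svc] = min(candidates)[1]
    return binding

print(select_peers(service_path, peers))
```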
{"title":"A scalable QoS-aware service aggregation model for peer-to-peer computing grids","authors":"Xiaohui Gu, K. Nahrstedt","doi":"10.1109/HPDC.2002.1029905","DOIUrl":"https://doi.org/10.1109/HPDC.2002.1029905","url":null,"abstract":"Peer-to-peer (P2P) computing grids consist of peer nodes that communicate directly among themselves through wide-area networks and can act as both clients and servers. These systems have drawn much research attention since they promote Internet-scale resource and service sharing without any administration cost or centralized infrastructure support. However aggregating different application services into a high-performance distributed application delivery in such systems is challenging due to the presence of dynamic performance information, arbitrary peer arrivals/departures, and systems' scalability requirement. In this paper we propose a scalable QoS-aware service aggregation model to address the challenges. The model includes two tiers: (1) on-demand service composition tier which is responsible for choosing and composing different application services into a service path satisfying the user's quality requirements; and (2) dynamic peer selection tier, which decides the specific peers where the chosen services are actually instantiated based on the dynamic, composite and distributed performance information. The model is designed and implemented in a fully distributed and self-organizing fashion. Conducting extensive simulations of a large-scale P2P system (10/sup 4/ peers), we show that our proposed model and algorithms achieve better performance than several common heuristic algorithms.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"48 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130359684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Grid-based knowledge discovery services for high throughput informatics
M. Ghanem, Yike Guo, A. Rowe, P. Wendel
Pub Date: 2002-07-24 | DOI: 10.1109/HPDC.2002.1029946
Discovery Net is an application layer for providing grid-based knowledge discovery services. These services allow scientists to create and manage complex knowledge discovery workflows that integrate data and analysis routines provided as remote services. They also allow scientists to store, share and execute these workflows as well as publish them as new services. Discovery Net provides a higher level of abstraction of the Grid for knowledge discovery activities, thus separating the end-users from resource management issues already handled by existing and emerging standards.
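As a small illustration of a knowledge-discovery workflow viewed as a dependency graph of analysis steps (the step names and in-process execution are assumptions; Discovery Net executes such workflows against remote grid services):

```python
# Workflow-as-DAG sketch: toy analysis steps executed in dependency order.
# Each toy step takes at most one upstream input. Step names are invented.
from graphlib import TopologicalSorter

steps = {
    "load":      (lambda _ : list(range(10)), []),
    "normalize": (lambda xs: [x / 9 for x in xs], ["load"]),
    "cluster":   (lambda xs: {"small": [x for x in xs if x < 0.5],
                              "large": [x for x in xs if x >= 0.5]}, ["normalize"]),
}

def run(steps):
    order = TopologicalSorter({k: set(v[1]) for k, v in steps.items()}).static_order()
    results = {}
    for name in order:
        func, deps = steps[name]
        inputs = results[deps[0]] if deps else None
        results[name] = func(inputs)
    return results

print(run(steps)["cluster"])
```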
{"title":"Grid-based knowledge discovery services for high throughput informatics","authors":"M. Ghanem, Yike Guo, A. Rowe, P. Wendel","doi":"10.1109/HPDC.2002.1029946","DOIUrl":"https://doi.org/10.1109/HPDC.2002.1029946","url":null,"abstract":"Discovery Net is an application layer for providing grid-based knowledge discovery services. These services allow scientists to create and manage complex knowledge discovery workflows that integrate data and analysis routines provided as remote services. They also allow scientists to store, share and execute these workflows as well as publish them as new services. Discovery Net provides a higher level of abstraction of the Grid for knowledge discovery activities, thus separating the end-users from resource management issues already handled by existing and emerging standards.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133596999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A secure distributed search system
Yinglian Xie, D. O'Hallaron, M. Reiter
Pub Date: 2002-07-24 | DOI: 10.1109/HPDC.2002.1029932
This paper presents the design, implementation and evaluation of Mingle, a secure distributed search system. Each participating host runs a Mingle server, which maintains an inverted index of the local file system. Users initiate peer-to-peer keyword searches by typing keywords to lightweight Mingle clients. Central to Mingle are its access control mechanisms and its insistence on user convenience. For access control, we introduce the idea of access-right mapping, which provides a convenient way for file owners to specify access permissions. Access control is supported through a single sign-on mechanism that allows users to conveniently establish their identity to Mingle servers, such that subsequent authentication occurs automatically, with minimal manual involvement. Preliminary performance evaluation suggests that Mingle is both feasible and scalable.
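The per-host inverted index can be pictured with the short sketch below, which indexes only file names and omits Mingle's access-right mapping and single sign-on layers; it is an illustration, not the system's code.

```python
# Sketch of a local inverted index: keyword -> set of file paths on this host.
# Indexes file names only; access control and content indexing are omitted.
import os
import re
from collections import defaultdict

def build_index(root):
    index = defaultdict(set)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            for token in re.findall(r"[a-z0-9]+", name.lower()):
                index[token].add(path)
    return index

def search(index, *keywords):
    """Return paths matching all keywords (intersection of posting sets)."""
    sets = [index.get(k.lower(), set()) for k in keywords]
    return set.intersection(*sets) if sets else set()

idx = build_index(os.path.expanduser("~"))
print(sorted(search(idx, "report", "2002"))[:10])
```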
{"title":"A secure distributed search system","authors":"Yinglian Xie, D. O'Hallaron, M. Reiter","doi":"10.1109/HPDC.2002.1029932","DOIUrl":"https://doi.org/10.1109/HPDC.2002.1029932","url":null,"abstract":"This paper presents the design, implementation and evaluation of Mingle, a secure distributed search system. Each participating host runs a Mingle server, which maintains an inverted index of the local file system. Users initiate peer-to-peer keyword searches by typing keywords to lightweight Mingle clients. Central to Mingle are its access control mechanisms and its insistence on user convenience. For access control, we introduce the idea of access-right mapping, which provides a convenient way for file owners to specify access permissions. Access control is supported through a single sign-on mechanism that allows users to conveniently establish their identity to Mingle servers, such that subsequent authentication occurs automatically, with minimal manual involvement. Preliminary performance evaluation suggests that Mingle is both feasible and scalable.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130138787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using semantic information to guide efficient parallel I/O on clusters
M. Schulz
Pub Date: 2002-07-24 | DOI: 10.1109/HPDC.2002.1029911
Despite the large I/O capabilities of modern cluster architectures with local disks on each node, most applications are unable to fully exploit them. This is especially problematic for data-intensive applications, which often suffer from low I/O performance. As one solution to this problem, a distributed I/O management (DIOM) system has been developed to manage a transparent distribution of data across cluster nodes and then allow applications to access this data purely from local disks. To be effective, however, this distribution process requires semantic information about both the application and the input data. This work therefore extends DIOM to include independent specifications for both data formats and application I/O patterns, thereby decoupling them. The work is driven by an application from nuclear medical imaging, the reconstruction of PET images, for which DIOM has proven to be an adequate solution, enabling truly scalable I/O and thereby improving overall application performance.
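A schematic of the decoupled specifications and the placement decision they drive is sketched below; the field names, record sizes, and round-robin policy are assumptions for illustration rather than DIOM's actual specification language.

```python
# Sketch of decoupled specs: one for the data format, one for the application's
# I/O pattern, and a placement routine mapping record blocks to node-local disks.
data_format = {"record_bytes": 4096, "records": 1_000_000}        # invented format spec
io_pattern  = {"access": "round_robin", "records_per_task": 2500} # invented pattern spec

def placement(data_format, io_pattern, nodes):
    """Map each contiguous block of records to the node whose local disk
    should hold it, so later reads are purely local."""
    block = io_pattern["records_per_task"]
    plan = []
    for start in range(0, data_format["records"], block):
        node = (start // block) % len(nodes)      # round-robin over nodes
        plan.append((start, min(start + block, data_format["records"]), nodes[node]))
    return plan

print(placement(data_format, io_pattern, nodes=["n0", "n1", "n2", "n3"])[:4])
```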
{"title":"Using semantic information to guide efficient parallel I/O on clusters","authors":"M. Schulz","doi":"10.1109/HPDC.2002.1029911","DOIUrl":"https://doi.org/10.1109/HPDC.2002.1029911","url":null,"abstract":"Despite the large I/O capabilities in modern cluster architectures with local disks on each node, applications mostly are not enabled to fully exploit them. This is especially problematic for data intensive applications which often suffer from low I/O performance. As one solution for this problem, a distribution I/O management (DIOM) system has been developed to manage a transparent distribution of data across cluster nodes and to then allow applications to access this data purely from local disks. In order to be effective, however, this distribution process requires semantic information about both the application and the input data. This work therefore extends DIOM to include independent specifications for both data formats and application I/O patterns and thereby decouples them. This work is driven by an application from nuclear medical imaging, the reconstruction of PET images, for which DIOM has proven to be an adequate solution enabling truly scalable I/O and thereby improving the overall application performance.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115072690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}