Architecture requirements for commercializing Grid resources
Chris M. Kenyon, G. Cheliotis
Pub Date: 2002-07-24 | DOI: 10.1109/HPDC.2002.1029921
Contemporary computing systems, especially large-scale systems such as Grids, promise ultra-fast, ubiquitous utility computing, always available at the flip of a switch. A major unresolved issue is the organization and efficient use of such infrastructure in a commercial context where several entities compete for shared resources. This has long been resolved for conventional utility resources such as gas and electricity through commoditization, a variety of market designs, customization, and decision support for the resulting portfolios of assets and commitments. The paper reviews the state of Grid commercialization and compares it to the commercialization of conventional resources. We draw specific lessons for commercialized Grids and detail them as requirements at each level of the architecture stack. We provide an example to illustrate the benefits of commercialized resources in terms of the financial clarity they bring to decisions for different user groups, namely application users and IT managers.
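As a toy illustration of that financial clarity (our example, not the paper's), commoditized compute lets a decision such as "run on owned capacity or buy on the market" be reduced to explicit arithmetic; all prices, quantities, and function names below are hypothetical.

```python
# Toy illustration (not from the paper) of the kind of decision that
# commoditized Grid resources make explicit: satisfy a compute demand
# from owned capacity, the market, or a mix. All numbers are hypothetical.

def cheapest_option(cpu_hours_needed, owned_free_cpu_hours,
                    owned_marginal_cost, market_spot_price):
    """Return (choice, cost) for satisfying a compute demand."""
    own_hours = min(cpu_hours_needed, owned_free_cpu_hours)
    bought_hours = cpu_hours_needed - own_hours
    cost_mixed = own_hours * owned_marginal_cost + bought_hours * market_spot_price
    cost_all_market = cpu_hours_needed * market_spot_price
    if cost_all_market < cost_mixed:
        return "buy all on market", cost_all_market
    return "use owned capacity first", cost_mixed

choice, cost = cheapest_option(1000, 400, 0.02, 0.05)
print(choice, f"${cost:.2f}")  # use owned capacity first $38.00
```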
{"title":"Architecture requirements for commercializing Grid resources","authors":"Chris M. Kenyon, G. Cheliotis","doi":"10.1109/HPDC.2002.1029921","DOIUrl":"https://doi.org/10.1109/HPDC.2002.1029921","url":null,"abstract":"Contemporary computing systems, especially large-scale systems such as Grids promise ultra-fast ubiquitous utility computing, always available at the flip of a switch. A major unresolved issue is the organization and efficient usage of such infrastructure in a commercial context where several entities compete for shared resources. This has long been resolved for conventional utility resources such as gas and electricity through commoditization, a variety of market designs, customization, and decision support for the resulting portfolios of assets and commitments. The paper reviews the state of Grid commercialization and compares it to the commercialization of conventional resources. We draw specific lessons for commercialized Grids and detail them as architecture requirements at each level of the architecture stack. We provide an example to illustrate the benefits of commercialized resources in terms of the financial clarity it brings to decisions for different user groups, namely application users and IT managers.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130056007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dynamic monitoring of high-performance distributed applications
D. Gunter, B. Tierney, K. Jackson, Jason R. Lee, M. Stoufer
Pub Date: 2002-07-24 | DOI: 10.1109/HPDC.2002.1029915
Developers and users of high-performance distributed systems often observe performance problems such as unexpectedly low throughput or high latency. Determining the source of these problems requires detailed end-to-end instrumentation of all components, including the applications, operating systems, hosts, and networks. However, the instrumentation must be designed with extremely low overhead so that it does not affect the system being monitored. In this paper we present a very lightweight instrumentation system that can be dynamically activated to unobtrusively collect and aggregate detailed end-to-end monitoring information from distributed applications. We also show how emerging "web services" can be used to facilitate remote interaction with this system.
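A minimal sketch of how such dynamically activated, low-overhead instrumentation might look (our illustration, not the authors' actual API): a disabled probe costs a single boolean test, while an enabled one buffers timestamped events locally for batch aggregation.

```python
import time
from collections import deque

# Sketch only; class and event names are ours, not the paper's.
class Probe:
    def __init__(self, buffer_size=10_000):
        self.active = False                # toggled remotely at run time
        self.events = deque(maxlen=buffer_size)

    def log(self, name, **fields):
        if not self.active:                # near-zero cost when disabled
            return
        self.events.append((time.time(), name, fields))

    def drain(self):
        """Hand the buffered events to an aggregator and clear the buffer."""
        batch, self.events = list(self.events), deque(maxlen=self.events.maxlen)
        return batch

probe = Probe()
probe.active = True                        # e.g. flipped via a remote call
probe.log("transfer.start", host="nodeA", nbytes=1 << 20)
probe.log("transfer.end", host="nodeA")
print(len(probe.drain()), "events buffered")
```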
{"title":"Dynamic monitoring of high-performance distributed applications","authors":"D. Gunter, B. Tierney, K. Jackson, Jason R. Lee, M. Stoufer","doi":"10.1109/HPDC.2002.1029915","DOIUrl":"https://doi.org/10.1109/HPDC.2002.1029915","url":null,"abstract":"Developers and users of high-performance distributed systems often observe performance problems such as unexpectedly low throughput or high latency. Determining the source of the performance problems requires detailed end-to-end instrumentation of all components, including the applications, operating systems, hosts, and networks. However, one must be very careful to design the instrumentation to have extremely low overhead, and not affect the system being monitored. In this paper we present a very light-weight instrumentation system that can be dynamically activated to unobtrusively collect and aggregate detailed end-to-end monitoring information from distributed applications. We also show how emerging \"web services\" can be used to facilitate remote interaction with this system.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129517352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Local discovery of system architecture - application parameter sensitivity: an empirical technique for adaptive grid applications
I. Corey, John R. Johnson, J. Vetter
Pub Date: 2002-07-24 | DOI: 10.1109/HPDC.2002.1029940
This study presents a technique that can significantly improve the performance of a distributed application by allowing the application to locally adapt to the architectural characteristics of distinct resources in a distributed system. Application performance is sensitive to system architecture-application parameter pairings. In a distributed or Grid-enabled application, a single parameter configuration for the whole application will not always be optimal for every participating resource; in particular, some configurations can significantly degrade performance. Furthermore, the behavior of a system may change during the course of a run. The technique described here provides an automated mechanism for run-time adaptation of application parameters to the local system architecture. Using a scaled-down simulation of a Monte Carlo physics code, we demonstrate that this technique conservatively achieves speedups of up to 65% on individual resources and may even provide an order-of-magnitude speedup in the extreme case.
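The following sketch shows the general shape of such a technique under our own assumptions (the trial-run protocol and all names are ours, not the paper's): each participating resource times short, scaled-down trials over a set of candidate parameter configurations and adopts the locally fastest one instead of a single global configuration.

```python
import time

# Sketch of per-resource parameter adaptation; not the paper's code.
def pick_local_config(kernel, candidate_configs, trial_input):
    """Time a scaled-down trial per configuration; return the fastest."""
    best_config, best_time = None, float("inf")
    for config in candidate_configs:
        start = time.perf_counter()
        kernel(trial_input, **config)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_config, best_time = config, elapsed
    return best_config

# Example: choose a block size for a toy blocked-summation kernel.
def kernel(data, block_size):
    return [sum(data[i:i + block_size]) for i in range(0, len(data), block_size)]

configs = [{"block_size": b} for b in (64, 256, 1024)]
print(pick_local_config(kernel, configs, list(range(100_000))))
```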
{"title":"Local discovery of system architecture - application parameter sensitivity: an empirical technique for adaptive grid applications","authors":"I. Corey, John R. Johnson, J. Vetter","doi":"10.1109/HPDC.2002.1029940","DOIUrl":"https://doi.org/10.1109/HPDC.2002.1029940","url":null,"abstract":"This study presents a technique that can significantly improve the performance of a distributed application by allowing the application to locally adapt to architectural characteristics of distinct resources in a distributed system. Application performance is sensitive to system architecture-application parameter pairings. In a distributed or Grid enabled application, a single parameter configuration for the whole application will not always be optimal for every participating resource. In particular, some configurations can significantly degrade performance. Furthermore, the behavior of a system may change during the course of the run. The technique described here provides an automated mechanism for run-time adaptation of application parameters to the local system architecture. Using a scaled-down simulation of a Monte Carlo physics code, we demonstrate that this technique can conservatively achieve speedups up to 65% on individual resources and may even provide order of magnitude speedup in the extreme case.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"356 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134215168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GriPhyN and LIGO, building a virtual data Grid for gravitational wave scientists
E. Deelman, C. Kesselman, Gaurang Mehta, L. Meshkat, L. Pearlman, K. Blackburn, P. Ehrens, A. Lazzarini, Roy Williams, S. Koranda
Pub Date: 2002-07-24 | DOI: 10.1109/HPDC.2002.1029922
Many physics experiments today generate large volumes of data, which is then processed in a variety of ways to advance the understanding of fundamental physical phenomena. The goal of the NSF-funded GriPhyN project (Grid Physics Network) is to enable scientists to seamlessly access data, whether it is raw experimental data or a data product resulting from further processing. GriPhyN provides a new degree of transparency in how data-handling and processing capabilities are integrated to deliver data products to end users or applications, so that requests for such products are easily mapped into computation and/or data access at multiple locations. GriPhyN refers to the set of all data products available to the user as virtual data. Among the physics applications participating in the project is the Laser Interferometer Gravitational-wave Observatory (LIGO), which is being built to observe the gravitational waves predicted by general relativity. We describe our initial design and prototype of a virtual data Grid for LIGO.
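A minimal sketch of the virtual data idea (our illustration, not the GriPhyN implementation): a request for a data product is served from an existing replica when one is registered, and is otherwise mapped to the computation that derives the product, whose result is then registered as a new replica. All product names and locations below are invented.

```python
# Toy virtual-data resolution; catalogs and names are hypothetical.
replicas = {"raw/segment-001": "gsiftp://siteA/raw/segment-001"}
recipes = {
    "calibrated/segment-001": ("calibrate", ["raw/segment-001"]),
}

def materialize(product):
    if product in replicas:                        # already exists somewhere
        return replicas[product]
    transform, inputs = recipes[product]           # derivable virtual data
    resolved = [materialize(p) for p in inputs]    # recurse on prerequisites
    location = run_transform(transform, resolved)  # schedule the computation
    replicas[product] = location                   # register the new replica
    return location

def run_transform(transform, input_locations):
    # Stand-in for submitting a Grid job; returns the output's location.
    return f"gsiftp://siteB/derived/{transform}"

print(materialize("calibrated/segment-001"))
```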
{"title":"GriPhyN and LIGO, building a virtual data Grid for gravitational wave scientists","authors":"E. Deelman, C. Kesselman, Gaurang Mehta, L. Meshkat, L. Pearlman, K. Blackburn, P. Ehrens, A. Lazzarini, Roy Williams, S. Koranda","doi":"10.1109/HPDC.2002.1029922","DOIUrl":"https://doi.org/10.1109/HPDC.2002.1029922","url":null,"abstract":"Many Physics experiments today generate large volumes of data. That data is then processed in a variety of ways in order to achieve the understanding of fundamental physical phenomena. The goal of the NSF-funded GriPhyN project (Grid Physics Network) is to enable scientists to seamlessly access data whether it is raw experimental data or a data product which is a result of further processing. GriPhyN provides a new degree of transparency in how data-handling and processing capabilities are integrated to deliver data products to end-users or applications, so that requests for such products are easily mapped into computation and/or data access at multiple locations. GriPhyN refers to the set of all data products available to the user as virtual data. Among the physics applications participating in the project is the Laser Interferometer Gravitational-wave Observatory (LIGO), which is being built to observe the gravitational waves predicted by general relativity. We describe our initial design and prototype of a virtual data Grid for LIGO.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"439 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122887522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leveraging run time knowledge about event rates to improve memory utilization in wide area data stream filtering
Beth Plale
Pub Date: 2002-07-24 | DOI: 10.1109/HPDC.2002.1029916
The dQUOB system's conceptualization of data streams as a database, together with its SQL interface to those streams, gives users an intuitive way to think about their data needs in a large-scale application containing hundreds if not thousands of data streams. Experience with dQUOB has shown the need for more aggressive memory management to achieve the scalability we desire. This paper addresses the problem with a two-fold solution. The first is the replacement of the existing first-come, first-served scheduling algorithm with an earliest-job-first algorithm, which we demonstrate yields better average service time. The second is an introspection algorithm that sets and adapts the sizes of join windows in response to knowledge acquired at runtime about event rates. In addition to the potential for significant improvements in memory utilization, the algorithm presented here also provides a means by which the user can reason about join window sizes. Wide-area measurements demonstrate the adaptive capability required by the introspection technique.
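A sketch of the introspection idea in our own notation (not dQUOB's code): size each stream's join window from its observed event rate, so that the window holds roughly the events that can arrive within the join horizon, and re-size it as the measured rate drifts.

```python
# Rate-driven join-window sizing; all names and constants are ours.
class JoinWindow:
    def __init__(self, horizon_seconds, alpha=0.2):
        self.horizon = horizon_seconds   # how far apart joinable events may be
        self.alpha = alpha               # smoothing factor for the rate estimate
        self.rate = 0.0                  # events per second, estimated online
        self.capacity = 1

    def observe(self, events_in_interval, interval_seconds):
        """Fold a new rate sample into the estimate and re-size the window."""
        sample = events_in_interval / interval_seconds
        self.rate = self.alpha * sample + (1 - self.alpha) * self.rate
        # Window must hold the events expected within the join horizon.
        self.capacity = max(1, int(self.rate * self.horizon))

w = JoinWindow(horizon_seconds=5.0)
for burst in (120, 80, 200):             # events seen in successive 10 s bins
    w.observe(burst, 10.0)
print(w.rate, w.capacity)                # smoothed rate and window size
```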
{"title":"Leveraging run time knowledge about event rates to improve memory utilization in wide area data stream filtering","authors":"Beth Plale","doi":"10.1109/HPDC.2002.1029916","DOIUrl":"https://doi.org/10.1109/HPDC.2002.1029916","url":null,"abstract":"The dQUOB system conceptualization of data streams as database and its SQL interface to data streams is an intuitive way for users to think about their data needs in a large scale application containing hundreds if not thousands of data streams. Experience with dQUOB has shown the need for more aggressive memory management to achieve the scalability we desire. This paper addresses the problem with a two-fold solution. The first one is replacement of the existing first-come first-served scheduling algorithm with an earliest job first algorithm which we demonstrate to yield better average service time. The second one is an introspection algorithm that sets and adapts the sizes of join windows in response to the knowledge acquired at runtime about event rates. In addition to the potential for significant improvements in memory utilization, the algorithm presented here also provides a means by which the user can reason about join window sizes. Wide area measurements demonstrate the adaptive capability required by the introspection technique.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130396090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}