G. Laszewski, A. Younge, Xi He, G. Mahinthakumar, Lizhe Wang
In recent years the power of Grid computing has grown exponentially through the development of advanced middleware systems. While usage has increased, the penetration of Grid computing in the scientific community has been less than some expected. This is due to a steep learning curve and a high entry barrier that limit the use of Grid computing and advanced cyberinfrastructure. For scientists to focus on actual scientific tasks, specialized tools and services need to be developed to ease the integration of complex middleware. Our solution is Cyberaide Shell, an advanced but simple-to-use system shell that provides access to the powerful cyberinfrastructure available today. Cyberaide Shell provides a dynamic interface that allows access to complex cyberinfrastructure in an easy and intuitive fashion on an ad hoc basis. This is accomplished by abstracting the complexities of resource, task, and application management through a scriptable command-line interface. Through a service integration mechanism, the shell’s functionality is exposed to a wide variety of frameworks and programming languages. Cyberaide Shell includes specialized experiment management and workflow commands that, combined with the scriptable nature of a shell, provide a set of services that were previously unavailable. The usability of Cyberaide Shell is demonstrated using a Water Threat Management application deployed on the TeraGrid.
{"title":"Experiment and Workflow Management Using Cyberaide Shell","authors":"G. Laszewski, A. Younge, Xi He, G. Mahinthakumar, Lizhe Wang","doi":"10.1109/CCGRID.2009.66","DOIUrl":"https://doi.org/10.1109/CCGRID.2009.66","PeriodicalId":118263,"journal":{"name":"2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid","volume":"76 1","pages":"0"},"publicationDate":"2009-05-18","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"127910120"}
We describe our experience implementing and integrating a new job scheduling algorithm in the gLite Grid middleware and present experimental results that compare it to the existing gLite scheduling algorithms. This is the first time that gLite scheduling algorithms have been tested and compared with a new algorithm under the same conditions. We describe the problems that were encountered and solved in going from theory and simulations to practice and the actual implementation of our scheduling algorithm. We also describe the steps one needs to follow to develop and test a new scheduling algorithm in gLite, the methodology followed, and the testbed that was set up for the comparisons. Our research sheds light on some of the problems of the existing gLite scheduling algorithms and makes clear the need to develop new ones.
{"title":"Developing Scheduling Policies in gLite Middleware","authors":"A. Kretsis, P. Kokkinos, Emmanouel Varvarigos","doi":"10.1109/CCGRID.2009.54","DOIUrl":"https://doi.org/10.1109/CCGRID.2009.54","PeriodicalId":118263,"journal":{"name":"2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid","volume":"61 1","pages":"0"},"publicationDate":"2009-05-18","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"127544409"}
Christopher Miceli, M. Miceli, S. Jha, Hartmut Kaiser, André Merzky
MapReduce has emerged as an important data-parallel programming model for data-intensive computing on Clouds and Grids. However, most if not all implementations of MapReduce are coupled to a specific infrastructure. SAGA is a high-level programming interface that provides the ability to create distributed applications in an infrastructure-independent way. In this paper, we show how MapReduce has been implemented using SAGA and demonstrate its interoperability across different distributed platforms: Grids, Cloud-like infrastructure, and Clouds. We discuss the advantages of programmatically developing MapReduce using SAGA by demonstrating that the SAGA-based implementation is infrastructure-independent whilst still providing control over the deployment, distribution, and runtime decomposition. The ability to control the distribution and placement of the computation units (workers) is critical for moving computational work to the data. This is required to keep network data transfer low and, in the case of commercial Clouds, to keep the monetary cost of computing the solution low. Using data sets of up to 10 GB and up to 10 workers, we provide a detailed performance analysis of the SAGA-MapReduce implementation and show how controlling the distribution of computation and the payload per worker helps enhance performance.
{"title":"Programming Abstractions for Data Intensive Computing on Clouds and Grids","authors":"Christopher Miceli, M. Miceli, S. Jha, Hartmut Kaiser, André Merzky","doi":"10.1109/CCGRID.2009.87","DOIUrl":"https://doi.org/10.1109/CCGRID.2009.87","PeriodicalId":118263,"journal":{"name":"2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid","volume":"70 1","pages":"0"},"publicationDate":"2009-05-18","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"129450056"}
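The MapReduce abstract above turns on two tunable quantities: the number of workers and the payload assigned to each. The following is a minimal local sketch of that idea in Python, not the SAGA-MapReduce implementation. It does not use SAGA at all, and the function names (`split_payload`, `map_phase`, `reduce_phase`, `map_reduce`), the word-count task, and the round-robin split are assumptions chosen for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def split_payload(lines: List[str], num_workers: int) -> List[List[str]]:
    """Assign input lines to workers round-robin (hypothetical placement policy).

    Each returned sublist is the payload handled by one worker.
    """
    return [lines[i::num_workers] for i in range(num_workers)]


def map_phase(chunk: List[str]) -> Counter:
    """Map step: count word occurrences within a single worker's payload."""
    counts: Counter = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_phase(partials: List[Counter]) -> Dict[str, int]:
    """Reduce step: merge the per-worker partial counts into one result."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


def map_reduce(lines: List[str], num_workers: int) -> Dict[str, int]:
    """Run the map step on num_workers local threads, then reduce."""
    chunks = split_payload(lines, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(map_phase, chunks))
    return reduce_phase(partials)


if __name__ == "__main__":
    sample = ["grid cloud grid", "cloud saga", "grid"]
    print(map_reduce(sample, num_workers=2))
```

Here the workers are local threads, so changing `num_workers` only changes how the payload is divided. In a distributed setting such as the one the abstract describes, the same split would also determine where each chunk is processed, which is why placement control affects data transfer and cost.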