BLAST Application with Data-Aware Desktop Grid Middleware
Haiwu He, G. Fedak, B. Tang, F. Cappello
2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, published 2009-05-18
DOI: 10.1109/CCGRID.2009.91 (https://doi.org/10.1109/CCGRID.2009.91)
Citations: 18
Abstract
Numerous Grid middleware systems exist for developing and executing programs on the computational Grid, but they still require intensive work from their users. BitDew is designed to facilitate the use of large-scale Grids with dynamic, heterogeneous, volatile, and highly distributed computing resources for applications that require a huge amount of data processing. Data-intensive applications form an important class of applications for the e-Science community; they require secure and coordinated access to large datasets, wide-area transfers, and broad distribution of terabytes of data while keeping track of multiple data replicas. In genetic biology, gene sequence comparison and analysis are the most basic routines. With the considerable increase in the number of sequences to analyze, we need more and more computing power as well as efficient solutions to manage data. In this work, we investigate the advantages of using a new Desktop Grid middleware, BitDew, designed for large-scale data management. Our contribution is two-fold. First, we introduce a data-driven Master/Slave programming model and present an implementation of BLAST over BitDew following this model. Second, we present extensive experimental and simulation results that demonstrate the effectiveness and scalability of our approach. We evaluate the benefit of multi-protocol data distribution in achieving remarkable speedups; we report on the ability to cope with highly volatile environments with relative performance degradation; we show the benefit of data replication in Grids with heterogeneous resource performance; and we evaluate the combination of data fault tolerance and data replication when computing on volatile resources.