Data-intensive workflows process and generate large amounts of data. The strategies used to stage data in and out of compute resources can significantly affect the overall execution of a workflow. We study the relationship between data placement services, which perform the staging, and workflow managers, which control the release of computational jobs. We describe a framework that classifies data staging strategies as decoupled, loosely-coupled, or tightly-coupled, based on their degree of interaction with the workflow manager. We present simulation results that investigate the effect of these three classes of strategies on synthetic workflows resembling those of real-world scientific applications.
Title: Data Staging Strategies and Their Impact on the Execution of Scientific Workflows
Authors: S. Bharathi, A. Chervenak
Published in: Proceedings of the second international workshop on Data-aware distributed computing, 2009-06-09
DOI: 10.1145/1552280.1592459 (https://doi.org/10.1145/1552280.1592459)