Data Staging Strategies and Their Impact on the Execution of Scientific Workflows
S. Bharathi, A. Chervenak
Proceedings of the Second International Workshop on Data-Aware Distributed Computing, 2009-06-09. DOI: https://doi.org/10.1145/1552280.1592459
Data-intensive workflows process and generate large amounts of data. The strategies used to stage data in and out of compute resources can significantly affect the overall execution of a workflow. We study the relationship between data placement services, which perform the staging, and workflow managers, which control the release of computational jobs. We describe a framework that classifies data staging strategies as decoupled, loosely coupled, or tightly coupled, according to their degree of interaction with the workflow manager. We present the results of simulation studies that investigate the effect of each of these three strategies on synthetic workflows resembling those of real-world scientific applications.