2009 IEEE International Conference on Cluster Computing and Workshops

Proceedings. IEEE International Conference on Cluster Computing | Pub Date: 2009-08-01 | DOI: 10.1109/CLUSTR.2009.5289149

S. Loebman, D. Nunley, YongChul Kwon, B. Howe, M. Balazinska, J. Gardner


Citations: 6


2009 IEEE International Conference on Cluster Computing and Workshops

As the datasets used to fuel modern scientific discovery grow increasingly large, they become increasingly difficult to manage using conventional software. Parallel database management systems (DBMSs) and massive-scale data processing systems such as MapReduce hold promise to address this challenge. However, since these systems have not been expressly designed for scientific applications, their efficacy in this domain has not been thoroughly tested. In this paper, we study the performance of these engines in one specific domain: massive astrophysical simulations. We develop a use case that comprises five representative queries. We implement this use case in one distributed DBMS and in the Pig/Hadoop system. We compare the performance of the tools to each other and to hand-written IDL scripts. We find that certain representative analyses are easy to express in each engine's high-level language, and that both systems provide competitive performance and improved scalability relative to current IDL-based methods.

