Proceedings of the 1st Workshop on Architectures and Systems for Big Data: Latest Articles

Automatic task slots assignment in Hadoop MapReduce

Proceedings of the 1st Workshop on Architectures and Systems for Big Data

Pub Date : 2011-10-10 DOI: 10.1145/2377978.2377982

Kun Wang, B. Tan, Juwei Shi, Bo Yang

In this paper, we address the problem caused by the fixed assignment of task slots in Hadoop MapReduce. Manually configuring optimal task slots is infeasible because workload characteristics vary. We design and implement an automatic control mechanism that dynamically assigns task slots based on the resource utilization of each Task Tracker node. The assignment takes the lag period into account. It can improve cluster-wide resource utilization and avoid contention. Experimental results show that our implementation can dynamically adjust the task slot capacity to the optimal setting at runtime. In some cases, such as Word Count, our control mechanism outperforms the current Hadoop with the optimal task slot configuration found by manual tuning.

Citations: 16
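The abstract above describes adjusting task slots from per-node resource utilization while respecting a lag period. The sketch below is only an illustration of that general idea, not the authors' mechanism: the class name `SlotController`, the thresholds `low` and `high`, the one-slot step size, and the use of a sample count as the lag period are all assumptions made for this example.

```python
from collections import deque


class SlotController:
    """Hypothetical utilization-driven slot controller (not the paper's algorithm)."""

    def __init__(self, slots=2, min_slots=1, max_slots=16,
                 low=0.60, high=0.90, lag=3):
        self.slots = slots          # current task-slot capacity on this node
        self.min_slots = min_slots
        self.max_slots = max_slots
        self.low = low              # average utilization below this adds a slot
        self.high = high            # average utilization above this removes a slot
        self.lag = lag              # samples required after a change before acting
        self._window = deque(maxlen=lag)

    def observe(self, utilization):
        """Record one utilization sample in [0, 1]; return the slot capacity."""
        self._window.append(utilization)
        # Wait out the lag period so the last change can show its effect.
        if len(self._window) < self.lag:
            return self.slots
        average = sum(self._window) / len(self._window)
        if average < self.low and self.slots < self.max_slots:
            self.slots += 1
            self._window.clear()    # restart the lag period
        elif average > self.high and self.slots > self.min_slots:
            self.slots -= 1
            self._window.clear()    # restart the lag period
        return self.slots


if __name__ == "__main__":
    controller = SlotController()
    for sample in [0.3, 0.3, 0.3, 0.95, 0.95, 0.95]:
        print(sample, controller.observe(sample))
```

Clearing the window after each change is one simple way to model the lag: the controller ignores the node until enough fresh samples reflect the new slot count.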

Myriad: parallel data generation on shared-nothing architectures

Proceedings of the 1st Workshop on Architectures and Systems for Big Data

Pub Date : 2011-10-10 DOI: 10.1145/2377978.2377983

Alexander B. Alexandrov, Berni Schiefer, John Poelman, Stephan Ewen, Thomas Bodner, V. Markl

The need for efficient data generation for the purposes of testing and benchmarking newly developed massively-parallel data processing systems has increased with the emergence of Big Data problems. As synthetic data model specifications evolve over time, the data generator programs implementing these models have to be adapted continuously, a task that often becomes more tedious as the set of model constraints grows. In this paper we present Myriad, a new parallel data generation toolkit. Data generators created with the toolkit can quickly produce very large datasets in a shared-nothing parallel execution environment, while at the same time preserving cross-partition dependencies, correlations and distributions in the generated data. In addition, we report on our efforts towards a benchmark suite for large-scale parallel analysis systems that uses Myriad for the generation of OLAP-style relational datasets.

Citations: 4
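The abstract above does not say how the toolkit keeps partitions consistent without communication. One common approach, shown here purely as an assumption and not as Myriad's actual design, is to derive every value deterministically from a global seed and the row's position, so that any node can compute any row on its own. The function names, the `orders` schema, and the SHA-256 based hashing are all hypothetical.

```python
import hashlib


def row_hash(seed, row_id, column):
    """Deterministic 64-bit integer for one (seed, row, column) triple."""
    digest = hashlib.sha256(f"{seed}:{row_id}:{column}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def generate_row(seed, row_id, num_customers=100):
    """Build one synthetic order; depends only on the seed and the row id."""
    return {
        "order_id": row_id,
        # Reference into a (hypothetical) customer table of fixed size.
        "customer_id": row_hash(seed, row_id, "customer") % num_customers,
        # Amount in cents, between 1.00 and 999.99.
        "amount_cents": 100 + row_hash(seed, row_id, "amount") % 99_900,
    }


def generate_partition(seed, total_rows, num_partitions, partition):
    """Generate the contiguous slice of rows owned by one partition."""
    start = partition * total_rows // num_partitions
    end = (partition + 1) * total_rows // num_partitions
    return [generate_row(seed, row_id) for row_id in range(start, end)]


if __name__ == "__main__":
    for p in range(3):
        print(p, generate_partition(seed=42, total_rows=10,
                                    num_partitions=3, partition=p))
```

Because no row depends on generator state carried over from earlier rows, the output is the same whether one worker or many produce it, and a worker can recompute a row owned by another partition when it needs to reference it.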

Application-driven energy-efficient architecture explorations for big data

Proceedings of the 1st Workshop on Architectures and Systems for Big Data

Pub Date : 2011-10-10 DOI: 10.1145/2377978.2377984

Xiaoyan Gu, Rui Hou, Ke Zhang, Lixin Zhang, Weiping Wang

Building energy-efficient systems is critical for big data applications. This paper investigates and compares the energy consumption and the execution time of a typical Hadoop-based big data application running on a traditional Xeon-based cluster and an Atom-based (micro-server) cluster. Our experimental results show that the micro-server platform is more energy-efficient than the Xeon-based platform. Our experimental results also reveal that data compression and decompression account for a considerable percentage of the total execution time. More precisely, data compression/decompression occupies 7-11% of the execution time of the map tasks and 37.9-41.2% of the execution time of the reduce tasks. Based on our findings, we demonstrate the necessity of using a heterogeneous architecture for energy-efficient big data processing. The desired architecture takes advantage of both micro-server processors and hardware compression/decompression accelerators. In addition, we propose a mechanism that enables the accelerators to perform more efficient data compression/decompression.

Citations: 9
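The compression shares reported in the abstract above can be turned into a rough bound on what an accelerator could gain. The sketch below applies Amdahl's law to those fractions. It is not a result from the paper: the function name `offload_speedup` and the accelerator factors are assumptions, and the calculation ignores data-transfer overhead.

```python
def offload_speedup(fraction, accel_factor):
    """Amdahl's law: task speedup when `fraction` of its time runs
    `accel_factor` times faster and the rest is unchanged."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    if accel_factor <= 0:
        raise ValueError("accel_factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / accel_factor)


if __name__ == "__main__":
    # Compression/decompression shares taken from the abstract.
    shares = {"map (low)": 0.07, "map (high)": 0.11,
              "reduce (low)": 0.379, "reduce (high)": 0.412}
    # Hypothetical accelerator factors; infinity gives the upper bound.
    for label, share in shares.items():
        for factor in (4.0, float("inf")):
            print(f"{label:14s} factor={factor}: "
                  f"{offload_speedup(share, factor):.2f}x")
```

Even with an infinitely fast accelerator, the bound for reduce tasks at the 41.2% share is 1 / (1 - 0.412), roughly 1.7x, while map tasks at 11% top out near 1.12x. Under these assumptions, offloading compression would matter far more on the reduce side.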

A collaborative memory system for high-performance and cost-effective clustered architectures

Proceedings of the 1st Workshop on Architectures and Systems for Big Data

Pub Date : 2011-10-10 DOI: 10.1145/2377978.2377979

A. Samih, Ren Wang, C. Maciocco, T. Tai, Yan Solihin

With the fast development of highly integrated distributed systems (cluster systems), especially those encapsulated within a single platform [28, 9], designers have to face interesting memory hierarchy design choices that attempt to avoid disk storage swapping. Disk swapping activities slow down application execution drastically. Leveraging remote free memory through Memory Collaboration has demonstrated its cost-effectiveness compared to overprovisioning for peak load requirements. Recent studies propose several ways of accessing under-utilized remote memory in static system configurations, without detailed exploration of dynamic memory collaboration. Dynamic collaboration is an important aspect given the run-time memory usage fluctuations in clustered systems. In this paper, we propose an Autonomous Collaborative Memory System (ACMS) that manages memory resources dynamically at run time to optimize performance and provide QoS measures for nodes engaging in the system. We implement a prototype realizing the proposed ACMS, experiment with a wide range of real-world applications, and show up to 3x performance speedup compared to a non-collaborative memory system, without perceivable performance impact on nodes that provide memory. Based on our experiments, we conduct a detailed analysis of the remote memory access overhead and provide insights for future optimizations.

Citations: 11

Evaluation of I/O technologies on a flash-based I/O sub-system for HPC

Proceedings of the 1st Workshop on Architectures and Systems for Big Data

Pub Date : 2011-10-10 DOI: 10.1145/2377978.2377980

Pietro Cicotti, Jeffrey W. Bennet, Shawn M. Strande, R. Sinkovits, A. Snavely

To meet the growing demand for high-performance computing systems that are capable of processing large datasets, the San Diego Supercomputer Center is deploying Gordon. This system was specifically designed for data-intensive workloads and uses flash memory to fill the large latency gap in the memory hierarchy between DRAM and hard disk. In preparation for the deployment of Gordon, we evaluated the performance of multiple remote storage technologies and file systems for use with the flash memory. We find that OCFS and XFS are both superior to PVFS at delivering fast random access to flash. In addition, our tests indicate that the Linux SCSI target framework (TGT) can export flash storage devices with minimal overhead and achieve a large fraction of the theoretical peak I/O performance. Despite the difficulties in fairly comparing I/O solutions due to the many differences in file systems and service implementations, we conclude that OCFS on TGT is a viable option for our system as it provides both excellent performance and a user-friendly shared file system interface. In those instances where a parallel file system is not required, XFS on TGT is a better alternative.

Citations: 3

Extending MPI to accelerators

Proceedings of the 1st Workshop on Architectures and Systems for Big Data

Pub Date : 2011-10-10 DOI: 10.1145/2377978.2377981

Jeff A. Stuart, P. Balaji, John Douglas Owens

Current trends in computing and system architecture point towards a need for accelerators such as GPUs to have inherent communication capabilities. We review previous and current software libraries that provide pseudo-communication abilities through direct message passing. We show how these libraries are beneficial to the HPC community, but are not forward-thinking enough. We give motivation as to why MPI should be extended to support these accelerators, and provide a road map of achievable milestones to complete such an extension, some of which require advances in hardware and device drivers.

Citations: 20

Proceedings of the 1st Workshop on Architectures and Systems for Big Data

Proceedings of the 1st Workshop on Architectures and Systems for Big Data

Pub Date : 1900-01-01 DOI: 10.1145/2377978

Citations: 2
