{"title":"ZCluster:基于zynq的Hadoop集群","authors":"Zhongduo Lin, P. Chow","doi":"10.1109/FPT.2013.6718411","DOIUrl":null,"url":null,"abstract":"ARM-based servers are garnering increasing interest in big data processing for their low power consumption. However, they are ill-suited for compute-intensive tasks due to their poor processing capability compared to the CPUs used in a traditional server. This paper describes our early efforts to integrate the processing power of the FPGA with the ARM processor inside the Xilinx Zynq SoC. An eight-slave Zynq-based Hadoop cluster is built and a customized hardware accelerator for a standard FIR filter is implemented to demonstrate the effectiveness of hardware acceleration. The Xillybus is used for communication between the ARM processor and the FPGA fabric, achieving a bandwidth of 103MB/s. The Hadoop cluster is proved to be linearly scalable with different input sizes and numbers of slaves. Overall, the cluster achieves a 3.3-fold speedup compared to a native pure software implementation on a single ARM processor and about a 20% improvement compared to an ARM-based cluster without hardware accelerators.","PeriodicalId":344469,"journal":{"name":"2013 International Conference on Field-Programmable Technology (FPT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"48","resultStr":"{\"title\":\"ZCluster: A Zynq-based Hadoop cluster\",\"authors\":\"Zhongduo Lin, P. Chow\",\"doi\":\"10.1109/FPT.2013.6718411\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"ARM-based servers are garnering increasing interest in big data processing for their low power consumption. However, they are ill-suited for compute-intensive tasks due to their poor processing capability compared to the CPUs used in a traditional server. This paper describes our early efforts to integrate the processing power of the FPGA with the ARM processor inside the Xilinx Zynq SoC. An eight-slave Zynq-based Hadoop cluster is built and a customized hardware accelerator for a standard FIR filter is implemented to demonstrate the effectiveness of hardware acceleration. The Xillybus is used for communication between the ARM processor and the FPGA fabric, achieving a bandwidth of 103MB/s. The Hadoop cluster is proved to be linearly scalable with different input sizes and numbers of slaves. Overall, the cluster achieves a 3.3-fold speedup compared to a native pure software implementation on a single ARM processor and about a 20% improvement compared to an ARM-based cluster without hardware accelerators.\",\"PeriodicalId\":344469,\"journal\":{\"name\":\"2013 International Conference on Field-Programmable Technology (FPT)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"48\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 International Conference on Field-Programmable Technology (FPT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/FPT.2013.6718411\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 International Conference on Field-Programmable Technology (FPT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/FPT.2013.6718411","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
ARM-based servers are garnering increasing interest in big data processing for their low power consumption. However, they are ill-suited for compute-intensive tasks due to their poor processing capability compared to the CPUs used in a traditional server. This paper describes our early efforts to integrate the processing power of the FPGA with the ARM processor inside the Xilinx Zynq SoC. An eight-slave Zynq-based Hadoop cluster is built and a customized hardware accelerator for a standard FIR filter is implemented to demonstrate the effectiveness of hardware acceleration. The Xillybus is used for communication between the ARM processor and the FPGA fabric, achieving a bandwidth of 103MB/s. The Hadoop cluster is proved to be linearly scalable with different input sizes and numbers of slaves. Overall, the cluster achieves a 3.3-fold speedup compared to a native pure software implementation on a single ARM processor and about a 20% improvement compared to an ARM-based cluster without hardware accelerators.