Research on performance optimization and visualization tool of Hadoop
Yan Xu, Wei Zhou, Binyue Cui, Lingyun Lu
2015 10th International Conference on Computer Science & Education (ICCSE), published 2015-07-22
DOI: 10.1109/ICCSE.2015.7250233 (https://doi.org/10.1109/ICCSE.2015.7250233)
Citations: 5
Abstract
Hadoop, a distributed system infrastructure developed by the Apache Software Foundation, has become a mainstream cloud-computing platform. Improving the performance of MapReduce, one of its core frameworks, has become a hot topic, yet achieving better computational performance remains a major challenge for programmers. Because research on displaying and analyzing program performance with visualization technology is receiving growing attention, many visualization tools for performance analysis and optimization have appeared. This paper analyzes sorting performance in the Map phase of Hadoop and proposes a method to optimize that sorting performance dynamically. It also surveys ten of the most popular visualization tools worldwide, identifies the R language, through comparison and analysis, as the tool suited to Hadoop, and introduces the combination of R and Hadoop. In future work, we will apply RHadoop to MapReduce performance optimization.