Design of real-time data analysis system based on Impala
Jingmin Li
2014 IEEE Workshop on Advanced Research and Technology in Industry Applications (WARTIA), 2014-12-08
DOI: 10.1109/WARTIA.2014.6976427 (https://doi.org/10.1109/WARTIA.2014.6976427)
Citations: 8
Abstract
As Internet technology continues to develop, analyzing massive volumes of data efficiently and in real time to extract valuable information has become especially important for enterprises. At present, a relatively common practice is to build the data analysis system on Hive in a Hadoop environment. However, Hive is better suited to batch processing of large data sets on clusters, and it is poorly suited to the real-time processing of large data that changing business requirements demand. This paper presents a real-time data analysis system based on Impala, which can serve as a good complement to the Hive-based approach. It explains the ideas and methods behind building such a system, covering system selection, system architecture, and practice.