Chen Hau Wang, Ching Tsorng Tsai, Chia-Chen Fan, S. Yuan
Title: A Hadoop Based Weblog Analysis System
Published in: 2014 7th International Conference on Ubi-Media Computing and Workshops
Publication date: 2014-07-12
DOI: 10.1109/U-MEDIA.2014.9 (https://doi.org/10.1109/U-MEDIA.2014.9)
Citations: 15
Abstract
In recent years, cloud computing has become an important research topic. It uses distributed storage and distributed computing to store large volumes of data and to analyze and process them quickly. With the rapid development of Internet technology, digital data is growing explosively, and traditional text-processing software and relational databases have hit a bottleneck when faced with data at this scale, producing unsatisfactory results. Cloud computing is a more appropriate choice for this problem. In this paper, we design and implement an enterprise Web log analysis system based on the Hadoop architecture, using HDFS (Hadoop Distributed File System), the Hadoop MapReduce software framework, and the Pig Latin language. In our experiments, analyzing daily Web log records yields Application Server traffic trends, statistical reports on program performance, and performance reports broken down by time interval and by the program action that a user requests. The main purpose of the system is to help system administrators quickly capture and analyze the potential value hidden in massive data, thus providing an important basis for business decisions.
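The abstract does not give the log layout or the actual jobs, so the following is only a minimal sketch of the general MapReduce pattern behind reports like these, written in plain Python rather than Hadoop or Pig Latin. The log format (Common Log Format with a trailing response time in milliseconds), the field names, and the functions `map_record`, `shuffle`, `reduce_group`, and `analyze` are all assumptions made for illustration, not the authors' implementation.

```python
import re
from collections import defaultdict
from datetime import datetime

# ASSUMPTION: Common Log Format plus a trailing response time in ms.
# The paper does not specify the layout of its Web logs.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ (?P<ms>\d+)$'
)


def map_record(line):
    """Map step: emit ((hour, path), response_ms) for each parseable line."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return  # skip malformed records instead of failing the whole job
    ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
    hour = ts.strftime("%Y-%m-%d %H:00")
    yield (hour, match.group("path")), int(match.group("ms"))


def shuffle(pairs):
    """Shuffle step: group emitted values by key (Hadoop does this itself)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_group(key, values):
    """Reduce step: request count and mean response time for one key."""
    return key, {"requests": len(values), "avg_ms": sum(values) / len(values)}


def analyze(lines):
    """Run map, shuffle, and reduce in-process over an iterable of log lines."""
    pairs = (pair for line in lines for pair in map_record(line))
    return dict(reduce_group(k, v) for k, v in shuffle(pairs).items())
```

Calling `analyze` on an iterable of log lines returns a dictionary keyed by `(hour, path)`; summing `requests` across paths for a given hour would give a traffic trend, while `avg_ms` per path corresponds loosely to a per-action performance report. On a real cluster the shuffle is handled by the framework, and the same grouping could be expressed in Pig Latin with `GROUP ... BY` followed by `FOREACH ... GENERATE`.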