B. Agrawal, Antorweep Chakravorty, Chunming Rong, T. Wlodarczyk
2014 IEEE 6th International Conference on Cloud Computing Technology and Science. Published 2014-12-15. DOI: 10.1109/CloudCom.2014.84 (https://doi.org/10.1109/CloudCom.2014.84)
R2Time: A Framework to Analyse Open TSDB Time-Series Data in HBase
In recent years, the amount of time-series data generated in different domains has grown consistently. Analyzing large time-series datasets from sensor networks, power grids, stock exchanges, social networks and cloud monitoring logs at massive scale is one of the biggest challenges data scientists face. Big data storage and processing frameworks provide an environment for handling the volume, velocity and frequency attributes associated with time-series data. We propose R2Time, an efficient, distributed computing framework for processing such data in the Hadoop environment. It integrates R with a distributed time-series database (Open TSDB) using a MapReduce programming framework (RHIPE). R2Time allows analysts to work on huge datasets from within a popular, well-supported, and powerful analysis environment.
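To make the MapReduce idea in the abstract concrete, here is a minimal, single-process sketch of a map/shuffle/reduce pass that averages time-series values per metric and hour. It is not R2Time's or RHIPE's API, and it is written in Python rather than R. The sample points, the function names (`map_point`, `shuffle`, `reduce_mean`, `hourly_means`) and the one-hour bucket size are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample points: (metric, unix_timestamp_seconds, value).
POINTS = [
    ("cpu.load", 1418600000, 0.5),
    ("cpu.load", 1418600060, 1.5),
    ("mem.used", 1418600000, 40.0),
    ("mem.used", 1418603600, 60.0),
]


def map_point(point, bucket_seconds=3600):
    """Map step: emit ((metric, bucket_start), value) for one data point."""
    metric, ts, value = point
    bucket_start = ts - ts % bucket_seconds  # align timestamp to the bucket
    return (metric, bucket_start), value


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_mean(grouped):
    """Reduce step: collapse each key's values into their mean."""
    return {key: mean(values) for key, values in grouped.items()}


def hourly_means(points):
    """Run map, shuffle and reduce in sequence over all points."""
    return reduce_mean(shuffle(map_point(p) for p in points))


if __name__ == "__main__":
    for key, avg in sorted(hourly_means(POINTS).items()):
        print(key, avg)
```

In a real Hadoop deployment the map and reduce steps would run in parallel across nodes and the shuffle would be handled by the framework; the sketch only shows how the work is divided.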