S. Rinaldi, Federico Bonafini, P. Ferrari, A. Flammini, E. Sisinni, D. Bianchini
Impact of Data Model on Performance of Time Series Database for Internet of Things Applications
2019 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), published 2019-05-20
DOI: 10.1109/I2MTC.2019.8827164 (https://doi.org/10.1109/I2MTC.2019.8827164)
Citations: 7
Abstract
The Internet of Things (IoT) paradigm is gaining interest in several application fields, from medical devices to smart buildings and industrial automation. This success is due to its flexibility and to the interoperability between different application domains: the ability to share data vertically among applications is the key strength of this technology. IoT sensors installed in the field generate large amounts of data, which must be stored for subsequent analysis. Database technologies are undergoing a deep transformation in order to handle these data streams, and the recent trend is a transition from relational to non-relational databases. Among the latter, Time Series Databases (TSDBs) appear to be a suitable solution for storing the large amounts of time series data generated by IoT applications. Although these solutions are optimized to handle thousands of parallel data streams from IoT sensors, their data extraction performance may not be compatible with some applications. This paper investigates the impact that different metadata can have on data extraction performance in TSDBs. A dedicated testing procedure has been set up to evaluate InfluxDB, one of the most effective and widespread TSDBs. The performance analysis, carried out on a specific use case, demonstrated that database write and read performance can be significantly affected by the data model used, with queries executed on the same data taking from hundreds of milliseconds to seconds in the worst cases.