ENTRADA: A high-performance network traffic data streaming warehouse
M. Wullink, G. Moura, M. Müller, Cristian Hesselman
NOMS 2016 - 2016 IEEE/IFIP Network Operations and Management Symposium, published 2016-07-04
DOI: 10.1109/NOMS.2016.7502925 (https://doi.org/10.1109/NOMS.2016.7502925)
Citations: 54
Abstract
We present ENTRADA, a high-performance data streaming warehouse that enables researchers and operators to analyze vast amounts of network traffic and measurement data within interactive response times (seconds to a few minutes), even on a small computer cluster. ENTRADA delivers this performance by employing an optimized file format and a high-performance query engine, both open-source. ENTRADA has been operational for more than 1.5 years and has ingested more than 100 TB of pcap files from two authoritative DNS servers for .nl. As we discuss, we use this data in projects that aim to further increase the security and stability of the .nl zone. In this paper we present our design choices, experiences, and a performance evaluation of ENTRADA. Finally, we open-source ENTRADA, which researchers, operators, and registries can use “out-of-the-box” to deploy their own network analysis clusters for DNS traffic, and which can be easily extended to handle any other structured data.