Towards Provable Network Traffic Measurement and Analysis via Semi-Labeled Trace Datasets

Milan Cermák, Tomás Jirsík, P. Velan, Jana Komárková, Stanislav Špaček, Martin Drasar, Tomáš Plesník
Published in: 2018 Network Traffic Measurement and Analysis Conference (TMA), pp. 1-8. Publication date: 2018-06-01.
DOI: 10.23919/TMA.2018.8506498 (https://doi.org/10.23919/TMA.2018.8506498)
Citations: 11

Abstract

Network traffic measurement and analysis is a long-established research field with growing interest from both scientists and industry. Yet even after many years, replication, criticism, and review of results remain rare. The field faces not only a lack of research standards but also a lack of access to appropriate datasets for developing and evaluating methods. As a result, much potentially high-quality research cannot be verified and is not adopted by industry or the community. This paper aims to overcome this problem with a solution that combines distinct approaches proposed in other research works. Unlike those studies, we address the issue as a whole, covering data anonymization, authenticity, recency, and public availability, as well as the use of such data for research provability. We believe these challenges can be solved by using semi-labeled datasets composed of real-world network traffic and annotated units that contain only interest-related packet traces. We outline the basic ideas of the methodology, from unit trace collection and semi-labeled dataset creation to its use in research evaluation. We hope this proposal will start a discussion of the approach and help overcome some of the challenges the research field faces today.
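To make the idea of a semi-labeled dataset more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a packet can be reduced to a timestamp, an opaque payload, and an optional label; the names `Packet`, `build_semi_labeled`, and `labeled_only`, as well as the example label, are hypothetical and do not come from the paper. Background traffic stays unlabeled, while packets from an annotated unit trace carry the unit's label.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Packet:
    # Hypothetical, simplified packet record (an assumption, not from the paper).
    timestamp: float             # seconds since the start of the capture
    payload: bytes               # raw packet bytes, treated as opaque here
    label: Optional[str] = None  # None means unlabeled background traffic


def build_semi_labeled(background: List[Packet],
                       unit: List[Packet],
                       unit_label: str,
                       offset: float) -> List[Packet]:
    """Merge one annotated unit trace into unlabeled background traffic.

    Unit packets are shifted by `offset` seconds and tagged with `unit_label`;
    background packets are left as they are. The result is sorted by timestamp.
    """
    shifted = [Packet(p.timestamp + offset, p.payload, unit_label) for p in unit]
    return sorted(background + shifted, key=lambda p: p.timestamp)


def labeled_only(dataset: List[Packet]) -> List[Packet]:
    """Return only the packets that carry a label (the known ground truth)."""
    return [p for p in dataset if p.label is not None]


if __name__ == "__main__":
    background = [Packet(0.0, b"a"), Packet(2.0, b"b")]
    unit = [Packet(0.0, b"x"), Packet(0.5, b"y")]
    dataset = build_semi_labeled(background, unit, "example-unit", offset=1.0)
    for packet in dataset:
        print(packet.timestamp, packet.label)
```

In this sketch, a detection method could be run on the whole merged dataset and then checked against the packets returned by `labeled_only`, since only those have a known annotation. How the paper itself collects, anonymizes, and merges unit traces is not specified in the abstract.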