Towards systematic traffic annotation

Romain Fontugne, P. Borgnat, P. Abry, K. Fukuda
DOI: 10.1145/1658997.1659006 (https://doi.org/10.1145/1658997.1659006)
Published in: Co-Next Student Workshop '09, 2009-12-01
Citations: 1

Abstract

Keeping Internet network resources available and secure is an unmet challenge. Hence, traffic classification and anomaly detection have received much attention in the last few years, and several algorithms have been proposed for backbone traffic. However, the evaluation of these methods usually lacks rigor, leading to hasty conclusions. Since synthetic data is widely criticized and no common labeled database (like the data sets from the DARPA Intrusion Detection Evaluation Program [6]) is available for backbone traffic, researchers analyze real data and validate their methods by manually inspecting their results, or by comparing their results with those of other methods. Our final goal is to label the MAWI database [2], a publicly available archive of real backbone traffic traces. Since manual labeling of backbone traffic is impractical, we build this database by cross-validating results from several methods with different theoretical backgrounds. This systematic approach makes it possible to maintain an up-to-date database to which recent traffic traces are regularly added and in which labels are improved as new algorithms appear. In this paper we discuss the difficulties faced in comparing events reported by distinct algorithms, and propose a methodology to do so.

This work will also help researchers understand the results of their algorithms. For instance, while developing an anomaly detector, researchers commonly face the problem of tuning its parameter set. The relationship between the analyzed traffic and the parameter set is complicated. Therefore, researchers usually run their application with numerous parameter settings and select the best parameter set by looking at the highest detection rate. Although this process is commonly accepted by the community, a crucial issue remains. Suppose a parameter set A gives a detection rate similar to that of a parameter set B, but a deeper analysis of the reported events shows that B is more effective for a certain kind of anomaly not detectable with parameter set A (and vice versa). This case is important and should not be ignored; however, it cannot be observed with a simple comparison of detection rates.
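
The parameter-set example can be made concrete with a small sketch. It is not the authors' methodology: the event identifiers, the anomaly kinds ("scan", "flood"), and the helper names are all hypothetical. The toy data also assumes labeled events, which the abstract notes are not available for real backbone traffic. It only illustrates how two parameter sets can share the same overall detection rate while detecting different kinds of anomalies.

```python
from typing import Dict, Set

# Hypothetical labeled events: event id -> kind of anomaly (toy data only).
TRUTH: Dict[str, str] = {
    "e1": "scan", "e2": "scan", "e3": "scan",
    "e4": "flood", "e5": "flood", "e6": "flood",
}


def detection_rate(detected: Set[str], truth: Dict[str, str]) -> float:
    """Fraction of all known anomalies that were reported."""
    return len(detected & set(truth)) / len(truth)


def rate_by_kind(detected: Set[str], truth: Dict[str, str]) -> Dict[str, float]:
    """Detection rate computed separately for each kind of anomaly."""
    totals: Dict[str, int] = {}
    hits: Dict[str, int] = {}
    for event, kind in truth.items():
        totals[kind] = totals.get(kind, 0) + 1
        if event in detected:
            hits[kind] = hits.get(kind, 0) + 1
    return {kind: hits.get(kind, 0) / totals[kind] for kind in totals}


# Hypothetical events reported under parameter sets A and B.
A: Set[str] = {"e1", "e2", "e3", "e4"}  # mostly scans
B: Set[str] = {"e1", "e4", "e5", "e6"}  # mostly floods

only_a = A - B  # events reported under A but missed under B
only_b = B - A  # events reported under B but missed under A

if __name__ == "__main__":
    print("overall A:", detection_rate(A, TRUTH))
    print("overall B:", detection_rate(B, TRUTH))
    print("per kind A:", rate_by_kind(A, TRUTH))
    print("per kind B:", rate_by_kind(B, TRUTH))
    print("only A:", sorted(only_a), "only B:", sorted(only_b))
```

Both sets report four of the six events, so a comparison of overall detection rates cannot tell them apart. The per-kind rates and the set differences expose the disagreement, which is the kind of event-level comparison the abstract argues for.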