GML learning, a generic machine learning model for network measurements analysis

2017 13th International Conference on Network and Service Management (CNSM). Pub Date: 2017-11-01. DOI: 10.23919/CNSM.2017.8255998

P. Casas, J. Vanerio, K. Fukuda


Citations: 19

Abstract

The application of machine learning models to network measurement problems has grown considerably over the last decade; however, there is still no clear best practice or silver-bullet approach for addressing these problems in a general context, and only ad hoc, tailored approaches have been evaluated so far. While deep-learning models have provided a major breakthrough in high-dimensional problems such as image processing, it remains difficult to say which model is best suited to analyzing the large volumes of high-dimensional data collected in operational networks. In this paper we present a potential solution to fill this gap, exploring the application of ensemble learning models to multiple network measurement problems. We introduce GML Learning, a generic Machine Learning model for the analysis of network measurements. The GML model is a generalization of the well-known stacking approach to ensemble learning, and follows the concepts of the Super Learner model. The Super Learner performs asymptotically as well as the best of its input base (or weak) learners, providing a powerful approach to tackling multiple problems with the same technique. In addition, it defines an approach to minimize the likelihood of over-fitting during training, using a variant of cross-validation. We deploy the GML model on top of Big-DAMA, a big data analytics framework for network measurement applications. We test the proposed solution on five different network measurement problems, including detection of network attacks and anomalies, QoE modeling and prediction, and Internet-path dynamics tracking. Results confirm that the GML model provides better results than any of the single baseline models of the stack, and outperforms traditional bagging and boosting ensemble learning approaches. The GML Learning model opens the door to generalizing a best-practice technique for the analysis of network measurements.
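
The stacking idea summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation and does not use Big-DAMA: it is a toy Super Learner-style ensemble for one-dimensional regression, written with the Python standard library only. The three base learners (`fit_mean`, `fit_line`, `fit_1nn`), the synthetic data, and the grid search over convex weights are all assumptions chosen for illustration. The sketch shows the two ingredients the abstract names: base-learner predictions are produced out-of-fold through cross-validation, and a meta-step combines them by minimizing the cross-validated risk.

```python
import math
import random

def k_folds(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

# Toy base learners (placeholders): each takes training data, returns a predictor.
def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def fit_1nn(xs, ys):
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

def out_of_fold(learners, xs, ys, k=5):
    """Z[i][j]: prediction for sample i from learner j fitted without i's fold."""
    n = len(xs)
    Z = [[0.0] * len(learners) for _ in range(n)]
    for fold in k_folds(n, k):
        held = set(fold)
        tr_x = [xs[i] for i in range(n) if i not in held]
        tr_y = [ys[i] for i in range(n) if i not in held]
        for j, fit in enumerate(learners):
            model = fit(tr_x, tr_y)
            for i in fold:
                Z[i][j] = model(xs[i])
    return Z

def simplex_grid(m, total):
    """Yield integer tuples of length m that sum to total."""
    if m == 1:
        yield (total,)
        return
    for a in range(total + 1):
        for rest in simplex_grid(m - 1, total - a):
            yield (a,) + rest

def fit_convex_weights(Z, ys, steps=10):
    """Grid-search convex weights that minimise the cross-validated MSE."""
    best_w, best_risk = None, float("inf")
    for counts in simplex_grid(len(Z[0]), steps):
        w = [c / steps for c in counts]
        blended = [sum(wj * zj for wj, zj in zip(w, row)) for row in Z]
        risk = mse(blended, ys)
        if risk < best_risk:
            best_w, best_risk = w, risk
    return best_w, best_risk

# Synthetic data standing in for a network measurement target (assumption).
rng = random.Random(42)
xs = [rng.uniform(0.0, 3.0) for _ in range(120)]
ys = [2.0 * x + math.sin(3.0 * x) + rng.gauss(0.0, 0.2) for x in xs]

learners = [fit_mean, fit_line, fit_1nn]
Z = out_of_fold(learners, xs, ys, k=5)
base_risks = [mse([row[j] for row in Z], ys) for j in range(len(learners))]
weights, cv_risk = fit_convex_weights(Z, ys)

# Final ensemble: refit every base learner on all data, then blend.
models = [fit(xs, ys) for fit in learners]

def predict(x):
    return sum(w * m(x) for w, m in zip(weights, models))

print("base CV MSE:", [round(r, 4) for r in base_risks])
print("weights:", weights, "ensemble CV MSE:", round(cv_risk, 4))
```

Because the weight grid includes the corner cases that put all weight on a single learner, the blended cross-validated risk in this sketch can never exceed that of the best base learner. This is a finite-sample, grid-restricted analogue of the "as well as the best base learner" property mentioned in the abstract, not a proof of the asymptotic result.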
