Gang Lu, Jing Du, Ronghua Guo, Ying Zhou, Haipeng Fu
Title: Application feature extraction by using both dynamic binary tracking and statistical learning
Venue: 2017 IEEE 17th International Conference on Communication Technology (ICCT)
Publication date: 2017-10-01
DOI: 10.1109/icct.2017.8359983
URL: https://doi.org/10.1109/icct.2017.8359983
Citations: 1
Abstract
Although application feature extraction is a popular topic in recent traffic classification research, only a few studies have extracted application features by jointly analyzing packet payloads, port allocation, and flow-level statistics. In this paper, we apply both dynamic binary tracking and statistical learning to application feature extraction. Specifically, we first accurately capture payload contents by automatically reverse debugging an application, and then recursively cluster those contents to generate protocol signatures. Afterwards, we perform a statistical analysis of ports to generate a port association rule. To identify encrypted applications, we present a feature selection algorithm that selects the optimal features from the time series statistics of the first ten packet sizes of each TCP flow. In a comparison with three typical feature selection algorithms, we show that our proposed algorithm is more effective. Additionally, we propose a scheme that combines protocol signatures, port association, and flow statistics for traffic classification. By evaluating our method on the identification of Thunder flows, we show that this combination is promising for traffic classification.
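
To make the flow-statistics step concrete, the following is a minimal Python sketch of computing candidate features from the first ten packet sizes of one TCP flow. It is an illustration only, not the method of the paper: the abstract does not list the statistics it uses, so the function name `packet_size_features`, the chosen statistics, and the sample packet sizes are all assumptions.

```python
from statistics import mean, pstdev

FIRST_N = 10  # the abstract considers the first ten packets of each TCP flow


def packet_size_features(sizes, n=FIRST_N):
    """Return summary statistics over the first n packet sizes of one flow.

    Hypothetical helper: the statistics below are common choices,
    not the feature set selected in the paper.
    """
    head = list(sizes[:n])
    if not head:
        raise ValueError("flow has no packets")
    # Differences between consecutive sizes describe how the series changes.
    deltas = [b - a for a, b in zip(head, head[1:])]
    abs_deltas = [abs(d) for d in deltas]
    return {
        "count": len(head),
        "mean": mean(head),
        "std": pstdev(head),
        "min": min(head),
        "max": max(head),
        "mean_abs_delta": mean(abs_deltas) if abs_deltas else 0.0,
    }


# Made-up packet sizes in bytes for a single flow (12 packets, only 10 are used).
flow = [74, 74, 66, 583, 66, 1514, 1514, 1514, 310, 66, 1514, 1514]
features = packet_size_features(flow)
print(features)
```

A feature selection algorithm such as the one the abstract describes would then choose a subset of candidate features like these before training a classifier for encrypted traffic.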