Tracking User Application Activity by using Machine Learning Techniques on Network Traffic
Sina Fathi Kazerooni, Yagiz Kaymak, R. Rojas-Cessa
2019 International Conference on Artificial Intelligence in Information and Communication (ICAIIC)
Published: 2019-02-01
DOI: 10.1109/ICAIIC.2019.8669040
Citations: 5
Abstract
A network eavesdropper may invade the privacy of an online user by collecting passing traffic and classifying the applications that generated it. This collection may be used to build fingerprints of the user’s Internet usage. In this paper, we investigate the feasibility of performing such a breach on encrypted network traffic generated by actual users. We adopt the random forest algorithm to classify the applications in use by users of a campus network. Our classification system identifies and quantifies statistical features of users’ network traffic to classify applications rather than inspecting packet contents. In addition, application classification is performed without relying on port mapping at the transport layer. Our results show that applications can be identified with an average precision and recall of up to 99%.
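To make the idea of classifying on statistical features rather than packet contents concrete, the following is a minimal sketch of how a flow might be summarized from header-level information only. The `Packet` record, the `flow_features` function, and the particular features chosen (packet sizes, inter-arrival gaps, direction ratio) are illustrative assumptions; the abstract does not list the features the authors actually used.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Dict, List


@dataclass
class Packet:
    """Hypothetical header-level view of one packet.

    Payload bytes and port numbers are deliberately absent, mirroring the
    abstract's constraint of not inspecting contents or using port mapping.
    """
    timestamp: float  # seconds since the start of the capture
    size: int         # packet length in bytes
    outbound: bool    # True if sent by the user, False if received


def flow_features(packets: List[Packet]) -> Dict[str, float]:
    """Summarize one flow with simple statistics (an assumed feature set)."""
    if not packets:
        raise ValueError("flow must contain at least one packet")
    ordered = sorted(packets, key=lambda p: p.timestamp)
    sizes = [p.size for p in ordered]
    # Inter-arrival gaps between consecutive packets; empty for a 1-packet flow.
    gaps = [b.timestamp - a.timestamp for a, b in zip(ordered, ordered[1:])]
    total_bytes = sum(sizes)
    out_bytes = sum(p.size for p in ordered if p.outbound)
    return {
        "packet_count": float(len(ordered)),
        "total_bytes": float(total_bytes),
        "mean_size": mean(sizes),
        "std_size": pstdev(sizes),
        "mean_gap": mean(gaps) if gaps else 0.0,
        "std_gap": pstdev(gaps) if gaps else 0.0,
        "outbound_byte_ratio": out_bytes / total_bytes if total_bytes else 0.0,
    }


# Made-up example flow: two small requests and two large responses.
flow = [
    Packet(0.00, 100, True),
    Packet(0.10, 1500, False),
    Packet(0.20, 1500, False),
    Packet(0.40, 100, True),
]
features = flow_features(flow)
```

Feature vectors of this kind, one per flow and labeled with the generating application, could then be used to train a random forest classifier such as scikit-learn's `RandomForestClassifier`; that step is omitted here because the abstract gives no training details.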