Network Transmission Flags Data Affinity-based Classification by K-Nearest Neighbor
N. Aljojo
ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY, published 2022-04-25
Impact factor: 1.2, JCR: Q3 (Multidisciplinary Sciences)
DOI: https://doi.org/10.14500/aro.10880
Citations: 1
Abstract
This research examines the data generated during a network transmission session to understand how value can be extracted from it and used to conduct tasks. Instead of comparing all of the transmission flags for a session at the same time, this paper conceptualizes the influence of each transmission flag on network-aware applications by comparing the flags one by one according to their impact on the application during the session. K-nearest neighbor (KNN) classification was used because it is a simple distance-based learning algorithm that remembers earlier training samples and is suited to relating various flags to their effect on application protocols: it compares each new sample with the K nearest points to make a decision. We used IP flow transmission session datasets obtained from Kaggle, with 87 features and 3,577,296 instances. We picked 13 features from the datasets and ran them through KNN. RapidMiner was used for the study, and the experimental results revealed that the KNN-based model was not only significantly more accurate in categorizing data but also significantly more efficient because of the decreased processing costs.
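The KNN decision rule described in the abstract (store the training samples, then label each new sample by its K nearest points) can be illustrated with a minimal Python sketch. This is not the authors' RapidMiner workflow: the function name `knn_predict`, the three flag-count features, the protocol labels, and the toy values are all hypothetical, and Euclidean distance with a simple majority vote is assumed.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, sample, k=3):
    """Classify `sample` by majority vote among its k nearest training points."""
    # Euclidean distance from the new sample to every stored training sample.
    distances = sorted(
        (dist(x, sample), label) for x, label in zip(train_X, train_y)
    )
    # Keep the labels of the k closest points and return the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: (SYN count, ACK count, PSH count) per flow.
# These values and labels are illustrative only, not taken from the Kaggle dataset.
train_X = [(1, 10, 4), (1, 12, 5), (2, 11, 4), (1, 2, 0), (1, 1, 0), (2, 2, 1)]
train_y = ["HTTP", "HTTP", "HTTP", "SSH", "SSH", "SSH"]

print(knn_predict(train_X, train_y, (1, 9, 3), k=3))
```

Because KNN only stores samples and defers all computation to prediction time, the cost of each prediction grows with the number of stored samples and features, which is one reason reducing 87 features to 13 could lower processing costs.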