Network traffic data to ARFF converter for association rules technique of data mining
Nattawat Khamphakdee, N. Benjamas, Saiyan Saiyod
2014 IEEE Conference on Open Systems (ICOS), published 2014-10-01
DOI: 10.1109/ICOS.2014.7042635 (https://doi.org/10.1109/ICOS.2014.7042635)
Citations: 8
Abstract
Network traffic data is the communication data that users generate on a network. It is large and contains both normal and abnormal behavior patterns. Analyzing network traffic data to detect abnormal behavior takes a long time, and the intrusion patterns are very hard to find. Data mining, however, can be used to extract normal and abnormal behavior patterns. Association rule mining is one data mining technique that is widely used to find patterns: it discovers events that occur frequently in the data. To find intrusion patterns, the network traffic data must first be converted to the format that the data mining process requires. In this paper, we propose a network traffic data to ARFF converter for the association rules technique of data mining. We developed the software using the Java language and the Weka library. To evaluate it, we used weeks 4 and 5 of the MIT-DARPA 1999 data set. First, we wrote Snort-IDS rules, ran detection over the data set, and recorded the alert data in a MySQL database. Second, we selected protocol header attributes (such as those of the TCP, ICMP, and UDP protocols) from the Snort database and saved the selected data as a .csv file. Third, we converted the .csv file to the .arff file format using the Weka library. Finally, we used the Apriori algorithm, an association rule mining technique, to discover relations among itemsets in the data set. The experimental results show that our application can discover the frequent itemsets in the data set and generate association rules that help computer and network administrators analyze user behavior. In addition, the application allows the number of attributes in a rule to be specified. The generated rules can therefore be applied in an intrusion detection system.
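The last two steps of the pipeline, converting the .csv file to .arff and mining frequent itemsets, can be illustrated with a minimal sketch. It is not the authors' implementation: the paper used Java with the Weka library, whereas this sketch uses only the Python standard library. The column names (`proto`, `dst_port`, `flags`), the relation name `snort_alerts`, and the helper functions are hypothetical. It also assumes every attribute is declared nominal, since association rule mining works on categorical items, and that empty cells are written as the ARFF missing-value marker `?`.

```python
import csv
import io
from itertools import combinations


def parse_csv(csv_text):
    """Split CSV text into a header row and a list of data rows."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    return rows[0], rows[1:]


def _quote(value):
    """Single-quote a name or value that contains ARFF special characters."""
    if any(ch in value for ch in " ,'\"{}%\t"):
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"
    return value


def csv_to_arff(header, records, relation="snort_alerts"):
    """Render the rows as ARFF text, declaring every column as nominal."""
    lines = ["@relation " + _quote(relation), ""]
    for i, name in enumerate(header):
        # The nominal domain is the sorted set of non-empty values seen.
        values = sorted({r[i] for r in records if r[i] != ""})
        domain = ",".join(_quote(v) for v in values)
        lines.append("@attribute %s {%s}" % (_quote(name), domain))
    lines += ["", "@data"]
    for r in records:
        # Empty cells become "?", the ARFF missing-value marker.
        lines.append(",".join(_quote(v) if v != "" else "?" for v in r))
    return "\n".join(lines) + "\n"


def frequent_itemsets(header, records, min_support):
    """Level-wise Apriori search; items are (attribute, value) pairs."""
    transactions = [
        frozenset((h, v) for h, v in zip(header, r) if v != "")
        for r in records
    ]
    n = len(transactions)
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        support = {
            c: sum(1 for t in transactions if c <= t) / n for c in candidates
        }
        level = {c: s for c, s in support.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


if __name__ == "__main__":
    sample = "proto,dst_port,flags\ntcp,80,S\ntcp,80,S\nudp,53,\ntcp,22,S\n"
    header, records = parse_csv(sample)
    print(csv_to_arff(header, records))
    for itemset, support in frequent_itemsets(header, records, 0.5).items():
        print(sorted(itemset), support)
```

The confidence of a rule such as `dst_port=80 => proto=tcp` follows from the returned supports: divide the support of the combined itemset by the support of the left-hand side. Limiting the number of attributes in a rule, as the abstract describes, would correspond to stopping the level-wise search at a chosen itemset size; how the authors' application exposes that setting is not specified here.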