2011 3rd Conference on Data Mining and Optimization (DMO): Latest Papers

A framework of rough reducts optimization based on PSO/ACO hybridized algorithms
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976520
Lustiana Pratiwi, Y. Choo, A. Muda
Rough reducts have contributed significantly to numerous feature selection studies. They have proven to be a reliable reduction technique for identifying the importance of attribute sets in an information system. The key difficulty in reduct calculation is that finding a minimal reduct, one with the smallest cardinality of attributes, is an NP-hard problem. This paper proposes an improved PSO/ACO optimization framework that enhances rough reduct performance by reducing computational complexity. The proposed framework consists of a three-stage optimization process: global optimization with PSO, local optimization with ACO, and a vaccination process on the discernibility matrix.
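To make the terms in this abstract concrete, the following is a minimal sketch of a discernibility matrix and a brute-force minimal reduct search on a toy decision table. It is not the paper's PSO/ACO framework: the table, the attribute names and all function names are hypothetical, and the exhaustive search is shown only to illustrate why the problem grows exponentially with the number of attributes.

```python
from itertools import combinations

# Hypothetical toy decision table: condition attributes a, b, c and decision d.
TABLE = [
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": "no"},
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 1, "b": 0, "c": 0, "d": "yes"},
]
CONDITIONS = ("a", "b", "c")


def discernibility_matrix(table, conditions, decision="d"):
    """For each pair of objects with different decisions, record the
    condition attributes on which the two objects differ."""
    matrix = {}
    for (i, x), (j, y) in combinations(enumerate(table), 2):
        if x[decision] != y[decision]:
            matrix[(i, j)] = {a for a in conditions if x[a] != y[a]}
    return matrix


def preserves_discernibility(subset, matrix):
    """True if the subset shares an attribute with every non-empty entry."""
    chosen = set(subset)
    return all(entry & chosen for entry in matrix.values() if entry)


def minimal_reducts(table, conditions):
    """Exhaustive search by increasing cardinality (illustration only)."""
    matrix = discernibility_matrix(table, conditions)
    for size in range(1, len(conditions) + 1):
        found = [set(s) for s in combinations(conditions, size)
                 if preserves_discernibility(s, matrix)]
        if found:
            return found
    return []


print(minimal_reducts(TABLE, CONDITIONS))
```

On this toy table the only subset of smallest size that separates every pair of objects with different decisions is {a, b}. A metaheuristic such as the one the abstract describes would replace the exhaustive loop in `minimal_reducts`.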
Citations: 6
A hybrid evaluation metric for optimizing classifier
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976522
M. Hossin, M. Sulaiman, A. Mustapha, N. Mustapha, R. Rahmat
The accuracy metric has been widely used for discriminating between candidate solutions and selecting an optimal one when constructing an optimized classifier. However, accuracy tends to lead the search process to sub-optimal solutions because of its limited capability to discriminate between values. In this study, we propose a hybrid evaluation metric that combines the accuracy metric with the precision and recall metrics. We call this new performance metric Optimized Accuracy with Recall-Precision (OARP). Using two counter-examples, the paper demonstrates that the OARP metric is more discriminating than the accuracy metric. To verify this advantage, we conduct an empirical study using a statistical discriminative analysis, which indicates that OARP is statistically more discriminating than the accuracy metric. We also demonstrate empirically that a naive stochastic classification algorithm trained with the OARP metric obtains better predictive results than one trained with the conventional accuracy metric. The experiments show that the OARP metric is a better evaluator and optimizer for constructing an optimized classifier.
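The limitation of accuracy described here can be illustrated with a small sketch. The abstract does not give the OARP formula, so the code below does not attempt to compute it; it only shows, on two hypothetical confusion matrices, how accuracy can tie two solutions that precision and recall tell apart. The counts and names are assumptions chosen for illustration.

```python
def metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) for a binary confusion matrix.
    Precision and recall fall back to 0.0 when their denominator is zero."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Hypothetical candidate solutions on 100 instances (10 positive, 90 negative).
solution_a = metrics(tp=0, fp=0, fn=10, tn=90)  # labels everything negative
solution_b = metrics(tp=8, fp=8, fn=2, tn=82)   # finds most of the positives

print(solution_a)
print(solution_b)
```

Both solutions have the same accuracy, so a search guided by accuracy alone cannot rank them, while precision and recall clearly favour the second one. This is the kind of tie a hybrid metric is meant to break.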
Citations: 12
High order fuzzy time series for exchange rates forecasting
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976496
L. Abdullah, I. Taib
Fuzzy time series models have been employed by many researchers in various forecasting activities, such as university enrolment, temperature, direct tax collection and, most popularly, stock prices. However, exchange rate forecasting, especially with high order fuzzy time series, has received less attention despite the large role exchange rates play in business transactions. This paper aims to test the forecasting of US dollar (USD) against Malaysian Ringgit (MYR) exchange rates using high order fuzzy time series and to check its accuracy. Twenty-five data sets of USD against MYR exchange rates were tested using the seven-step high order fuzzy time series procedure. The results show that higher order fuzzy time series yield very small errors, so the model provides a good forecasting tool for exchange rates.
Citations: 7
Reducing network intrusion detection association rules using Chi-Squared pruning technique
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976515
Ammar Fikrat Namik, Z. Othman
The growing number of computer networks has increased the effort required to secure them against various attack risks. An Intrusion Detection System (IDS) is a popular tool for securing a network. Applying data mining has improved the quality of intrusion detection, either as anomaly detection or as misuse detection, on large-scale network traffic transactions. Association rule mining is a popular technique for producing good-quality misuse detection. However, its weakness is that it often produces thousands of rules, which reduces the performance of the IDS. This paper shows how post-mining can reduce the number of rules while retaining the highest-quality rules for producing quality signatures. The experiment was conducted on two data sets collected from KDD Cup 99. Each data set is partitioned into four data sets by attack type (Probe, U2R, R2L and DoS). Each partition is mined with the Apriori algorithm and then post-mined with a Chi-Squared (χ²) computation technique. Rule quality is measured by the Chi-Squared value, which is calculated from the support, confidence and lift of each association rule. The experimental results show that post-mining reduces the number of rules by up to 98% while retaining the quality rules.
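The abstract states that the Chi-Squared value is calculated from support, confidence and lift but does not give the formula. The sketch below uses an algebraic identity for the 2×2 contingency table of a rule A → B, which may or may not be the exact form the authors used, and checks it against the usual observed-versus-expected computation. The transaction counts and function names are hypothetical.

```python
def chi_squared_from_measures(n, support, confidence, lift):
    """Chi-squared of a rule A -> B from its support, confidence and lift.

    Undefined when confidence == support (A covers every transaction)
    or lift == confidence (B covers every transaction).
    """
    return (n * (lift - 1) ** 2 * support * confidence
            / ((confidence - support) * (lift - confidence)))


def chi_squared_from_counts(n, n_a, n_b, n_ab):
    """Reference computation from the 2x2 table of A and B."""
    observed = [n_ab, n_a - n_ab, n_b - n_ab, n - n_a - n_b + n_ab]
    expected = [n_a * n_b / n, n_a * (n - n_b) / n,
                (n - n_a) * n_b / n, (n - n_a) * (n - n_b) / n]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: 1000 transactions, A in 200, B in 300, both in 120.
N, N_A, N_B, N_AB = 1000, 200, 300, 120
support = N_AB / N               # P(A and B)
confidence = N_AB / N_A          # P(B | A)
lift = confidence / (N_B / N)    # P(B | A) / P(B)

from_measures = chi_squared_from_measures(N, support, confidence, lift)
from_counts = chi_squared_from_counts(N, N_A, N_B, N_AB)
print(round(from_measures, 3), round(from_counts, 3))
```

Both routes give the same value for this rule, so a pruning step can rank rules using only the three measures that Apriori already reports. Which threshold the paper applies is not stated in the abstract; a common choice for one degree of freedom is the 5% critical value of about 3.84.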
Citations: 3
Fuzzy projective clustering in high dimension data using decrement size of data
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976521
S. Mehdi Seyednejad, Hamidreza Musavi, S. Mohaddese Seyednejad, Tooraj Darabi
Data clustering has become an important challenge in the data mining domain. One kind of clustering is projective clustering. Much research has been done in this area, but each of the previous algorithms has some defects, which we point out in this paper. We propose a new algorithm based on fuzzy sets. It first detects and eliminates the unimportant properties for all clusters, then removes outliers, and finally applies a weighted fuzzy c-means algorithm using the proposed formula for the fuzzy calculations. Experimental results show that our approach achieves better performance and accuracy than similar algorithms.
Citations: 0
Gravitational search algorithm with heuristic search for clustering problems
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976526
A. Hatamlou, S. Abdullah, Z. Othman
In this paper, we present an efficient algorithm for cluster analysis based on gravitational search and a heuristic search algorithm. In the proposed algorithm, called GSA-HS, the gravitational search algorithm is first used to find a near-optimal solution for the clustering problem, and a heuristic search algorithm is then applied to improve this initial solution by searching around it. Four benchmark datasets are used to evaluate the presented algorithm and to compare its performance with two other well-known clustering algorithms, K-means and particle swarm optimization. The results show that the proposed algorithm can find high-quality clusters in all the tested datasets.
Citations: 44
Time series similarity search based on Middle points and Clipping
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976498
Thanh Son Nguyen, T. Duong
In this paper, we introduce a new time series dimensionality reduction method, MP_C (Middle points and Clipping). The method divides a time series into segments, extracts some points from each segment, and then transforms these points into a sequence of bits. The points are chosen by dividing each segment into sub-segments and selecting the middle point of each sub-segment. We prove that MP_C satisfies the lower bounding condition, and we make MP_C indexable by showing that a time series compressed by MP_C can be indexed with the support of a Skyline index. Our experiments show that MP_C is better than PAA in terms of tightness of lower bound and pruning power, and that in similarity search MP_C with the support of a Skyline index performs faster than PAA based on a traditional R*-tree.
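The lower bounding condition mentioned in this abstract can be illustrated with PAA, the baseline that MP_C is compared against. The sketch below is not MP_C itself, whose clipping rule the abstract does not spell out; it shows the standard PAA reduction and its distance, which never exceeds the true Euclidean distance and therefore allows index pruning without false dismissals. The example series and function names are assumptions.

```python
import math


def paa(series, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-length
    segment. Assumes len(series) is divisible by segments."""
    width = len(series) // segments
    return [sum(series[i * width:(i + 1) * width]) / width
            for i in range(segments)]


def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def paa_distance(x, y, segments):
    """Distance in the reduced space, scaled by sqrt(segment width) so
    that it lower-bounds the Euclidean distance of the original series."""
    width = len(x) // segments
    return math.sqrt(width) * euclidean(paa(x, segments), paa(y, segments))


# Hypothetical series of length 8, reduced to 2 segments.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 2, 2, 2, 6, 6, 6, 6]

print(paa(x, 2), paa(y, 2))
print(paa_distance(x, y, 2), euclidean(x, y))
```

The closer the reduced distance is to the true distance, the tighter the lower bound and the more candidates an index can discard. Tightness of lower bound and pruning power are the two criteria on which the abstract compares MP_C with PAA.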
Citations: 9
Anomaly detection for PTM's network traffic using association rule
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976506
Entisar E. Eljadi, Z. Othman
In order to evaluate the quality of UKM's NIDS, this paper presents the process of analyzing network traffic captured by Pusat Teknologi Maklumat (PTM) to detect whether it contains any anomalies and to produce corresponding anomaly rules for inclusion in an update of UKM's NIDS. The network traffic data was collected with Wireshark over three days, using the six most common network attributes. The experiment used three association rule data mining techniques, Apriori, Fuzzy Apriori and FP-Growth, with two-, five- and ten-second window slicing. Of the four data sets, data sets one and two were found to contain anomalies. The results show that the Fuzzy Apriori algorithm gave the best-quality results, while FP-Growth reached a solution faster. The data sets pre-processed with two-second window slicing gave better results. This research outlines the steps an organization can use to capture and detect anomalies with association rule data mining techniques to enhance the quality of its NIDS.
Citations: 5
Bess or xbest: Mining the Malaysian online reviews
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976502
Norlela Samsudin, Mazidah Puteh, A. Hamdan
Advancement in information and technology facilities, especially the Internet, has changed the way we communicate and express opinions or sentiments on the services or products that we consume. Opinion mining aims to automate the process of classifying opinions into positive or negative views. It benefits both customers and sellers in identifying the best product or service. Although researchers have explored new techniques for identifying sentiment polarity, little work has been done on mining opinions written by Malaysian reviewers. The same is true of micro-text. In this study, we therefore conduct exploratory research on opinion mining of online movie reviews collected from several forums and blogs written by Malaysians. The experimental data are tested using machine learning classifiers, namely Support Vector Machine, Naïve Bayes and k-Nearest Neighbor. The results illustrate that the performance of these machine learning techniques without any preprocessing of the micro-texts or feature selection is quite low. Additional steps are therefore required in order to mine the opinions from these data.
Citations: 12
Scatter search for solving the course timetabling problem
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976530
Ghaith M. Jaradat, M. Ayob
Scatter Search (SS) is an evolutionary population-based metaheuristic that has been successfully applied to hard combinatorial optimization problems. In contrast to the genetic algorithm, it reduces the population of solutions to a promising set of solutions in terms of quality and diversity, to maintain a balance between diversification and intensification of the search. It also avoids random sampling mechanisms such as crossover and mutation when generating new solutions. Instead, it performs a crossover in the form of structured solution combinations based on two good-quality and diverse solutions. In this study, we propose an SS approach for solving the course timetabling problem. The approach focuses on two of its main methods: the reference set update method and the solution combination method. Both provide a deterministic search process by maintaining the diversity of the population. This is achieved by manipulating a dynamic population size and performing a probabilistic selection procedure in order to generate a promising reference set (elite solutions). It is also of interest to incorporate an Iterated Local Search routine into the SS method, to exploit the generated good-quality solutions more effectively, to escape from local optima and to decrease the computational time. Experimental results show that our SS approach produces good-quality solutions and outperforms some results reported in the literature (on Socha's instances), including population-based algorithms.
Citations: 8