
2011 3rd Conference on Data Mining and Optimization (DMO): Latest Publications

Comparison of various Wiener model identification approach in modelling nonlinear process
Pub Date : 2011-08-04 DOI: 10.1109/DMO.2011.5976517
Imam Mujahidin Iqbal, N. Aziz
An accurate yet simple model is essential for implementing a model-based controller. The Wiener model is one of the simplest nonlinear models that can represent any nonlinear process. However, several identification approaches are available for developing a Wiener model, and one must be selected to produce the most accurate model. In this work, the nonlinear-linear approach, the linear-nonlinear approach, and the simultaneous approach are compared in identifying a Wiener model of a nonlinear pH neutralization process. The parameters of the linear block and of the inverse of the nonlinear block were obtained from several generated data sets. The approaches are then compared in terms of model accuracy, calculation time, data requirements, and flexibility.
Citations: 9
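A rough sketch of the Wiener structure the abstract above refers to: a linear dynamic block followed by a static nonlinearity. The first-order filter, its coefficients `a` and `b`, and the `tanh` nonlinearity are illustrative assumptions, not the model identified in the paper.

```python
import math


def simulate_wiener(inputs, a=0.8, b=0.2):
    """Simulate an assumed Wiener model: linear dynamics, then a static nonlinearity."""
    x = 0.0  # state of the linear block
    outputs = []
    for u in inputs:
        x = a * x + b * u            # assumed first-order linear block
        outputs.append(math.tanh(x))  # assumed static nonlinear block
    return outputs


def inverse_nonlinearity(y):
    """Invert the assumed static block to recover the intermediate linear signal."""
    return math.atanh(y)


# Step response: the linear block settles at b / (1 - a) = 1.0,
# so the output settles at tanh(1.0).
step_response = simulate_wiener([1.0] * 200)
```

Applying `inverse_nonlinearity` to the measured output exposes the intermediate signal, which is why identifying the inverse of the nonlinear block lets the linear block be fitted with ordinary linear methods.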
Frequent pattern using Multiple Attribute Value for itemset generation
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976503
Zalizah Awang Long, A. Bakar, Abdul Razak Hamdan
Data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Association Rules Mining (ARM), and the Apriori algorithm in particular, has been an active research area in recent years. Proposed improvements vary in how many frequent items they produce and in how far they extend the k-length of the generated itemsets. The idea is to produce better patterns and more interesting rules. In this paper, we propose a new approach to ARM based on Multiple Attribute Value (MAV) within non-binary search spaces. The proposed algorithm improves existing frequent pattern mining by generating the most frequent values (items) within each attribute and generating candidates based on those frequent attribute values. The main idea of our work is to discover more meaningful frequent items and maximum k-length items. The experimental results show that the proposed MAV frequent pattern mining improves the generation of frequent items and of maximum-length itemsets.
Citations: 2
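The idea of finding frequent values within each attribute and building candidates from them can be sketched as follows. This is only one possible reading of the abstract above: the function names, the toy rows, and the rule of pairing values from different attributes are assumptions for illustration, not the authors' MAV algorithm.

```python
from collections import Counter
from itertools import combinations


def frequent_attribute_values(rows, min_count):
    """Count (attribute, value) pairs and keep those seen at least min_count times."""
    counts = Counter((attr, val) for row in rows for attr, val in row.items())
    return {item: n for item, n in counts.items() if n >= min_count}


def candidate_pairs(frequent_items):
    """Pair frequent items that come from different attributes."""
    items = sorted(frequent_items)
    return [(a, b) for a, b in combinations(items, 2) if a[0] != b[0]]


# Hypothetical non-binary records: each attribute takes one of several values.
rows = [
    {"colour": "red", "size": "L"},
    {"colour": "red", "size": "M"},
    {"colour": "blue", "size": "L"},
    {"colour": "red", "size": "L"},
]
frequent = frequent_attribute_values(rows, min_count=3)
pairs = candidate_pairs(frequent)
```

Restricting candidates to already-frequent attribute values keeps the candidate set small, which is the same pruning principle Apriori uses for binary items.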
Modeling forest fires risk using spatial decision tree
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976512
R. Yaakob, N. Mustapha, A. Nuruddin, I. S. Sitanggang
Forest fires have long been annual events in many parts of Sumatra, Indonesia, during the dry season. Riau Province is one of the regions in Sumatra where serious forest fires occur every year, mostly because of human factors, both deliberate and accidental. Forest fire models have been developed for certain areas using weights and criteria for variables, which involve subjective and qualitative judgment. The weight for each criterion is determined from expert knowledge or the developers' previous experience, which may result in overly subjective models. In addition, criteria evaluation and weighting methods are mostly applied to small problems containing few criteria. This paper presents our initial work in developing a spatial decision tree using the spatial ID3 algorithm and the Spatial Join Index applied in the SCART (Spatial Classification and Regression Trees) algorithm. The algorithm is applied to historical forest fire data for Rokan Hilir, a district in Riau, to develop a model of forest fire risk. The model includes physical as well as social and economic variables. The result is a spatial decision tree containing 138 leaves, with distance to the nearest river as the first test attribute.
Citations: 1
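For readers unfamiliar with ID3, the attribute chosen as the first test is the one with the highest information gain. The sketch below shows the classic non-spatial computation only; the spatial extensions mentioned in the abstract above are not modelled, and the attribute names and four toy records are invented for illustration, not taken from the Rokan Hilir data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting rows on one attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_attribute(rows, attributes, target):
    """Pick the attribute ID3 would test first: the one with the highest gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Invented toy records; "attr_a" separates the classes perfectly, "attr_b" not at all.
toy_rows = [
    {"attr_a": "low", "attr_b": "x", "fire": "yes"},
    {"attr_a": "low", "attr_b": "y", "fire": "yes"},
    {"attr_a": "high", "attr_b": "x", "fire": "no"},
    {"attr_a": "high", "attr_b": "y", "fire": "no"},
]
root = best_attribute(toy_rows, ["attr_a", "attr_b"], "fire")
```

The same selection is then repeated recursively on each subset until the leaves are pure or no attributes remain.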
Evaluating Integrated Weight Linear method to class imbalanced learning in video data
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976535
Zainal Apandi, N. Mustapha, L. S. Affendey
Given the enormous amount of video data, especially in the presence of noisy and irrelevant information, it is difficult for a typical detection process to capture the small portion of targeted instances because of the class imbalance problem. In this paper, class imbalance refers to a very small percentage of positive instances relative to negative instances, where the negative instances dominate the detection model and degrade detection performance. This paper proposes an Integrated Weight Linear (IWL) method that integrates the weight linear (WL) algorithm with principal component analysis (PCA) to eliminate the imbalance in soccer video data. PCA is adopted in the first phase to alleviate the imbalance and to prepare the reduced instances for the next phase. In the second phase, the reduced instances are refined using the weight linear algorithm. Experimental results on 9 soccer videos demonstrate that the integration of PCA and WL alleviates the imbalance problem and improves classification performance on video data.
Citations: 2
A greedy constructive approach for Nurse Rostering Problem
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976532
Mouna Jamom, M. Ayob, Mohammed Hadwan
The Nurse Rostering Problem (NRP) is concerned with producing a high-quality, workable duty roster for the available staff nurses. The aim of this work is to present a greedy constructive heuristic algorithm that generates a feasible initial solution by satisfying the hard constraints. Constructing the initial solution involves three steps. First, we design a group of shift patterns based on hard and soft constraints. Then, those patterns are rotated for predefined positions and allocated to each nurse. Finally, if the solution is not feasible, we apply a repair mechanism. A real-world problem from Universiti Kebangsaan Malaysia Medical Centre (UKMMC) is used to test the proposed algorithm. The resulting roster demonstrates that the proposed algorithm generates a good-quality duty roster in reasonable computational time for our case study.
Citations: 5
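The three steps in the abstract above (design patterns, rotate and allocate them, repair if infeasible) can be sketched on a toy instance. Everything concrete here is an assumption made for illustration: the four-day pattern, the shift codes `M`, `E`, `N` and `O` (off), the single hard constraint that every shift is covered each day, and the repair rule of reassigning a nurse who is off. The UKMMC constraints are not reproduced.

```python
def rotate(pattern, offset):
    """Return the pattern shifted left by offset positions (cyclically)."""
    k = offset % len(pattern)
    return pattern[k:] + pattern[:k]


def build_roster(pattern, n_nurses):
    """Allocate nurse i the base pattern rotated by i positions."""
    return [rotate(pattern, i) for i in range(n_nurses)]


def uncovered(roster, required=("M", "E", "N")):
    """List (day, shift) pairs for which no nurse is assigned."""
    days = len(roster[0])
    return [(d, s) for d in range(days) for s in required
            if all(row[d] != s for row in roster)]


def repair(roster, required=("M", "E", "N")):
    """Assumed repair rule: cover a missing shift with a nurse who is off that day."""
    fixed = [list(row) for row in roster]
    for day, shift in uncovered(roster, required):
        for row in fixed:
            if row[day] == "O":
                row[day] = shift
                break
    return fixed


base_pattern = ["M", "E", "N", "O"]
initial = build_roster(base_pattern, n_nurses=3)
gaps = uncovered(initial)
final = repair(initial)
```

With three nurses the rotated patterns leave one shift uncovered on three of the four days, and the repair step fills each gap from an off day; with four nurses the rotation alone already covers every shift.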
A semi-cyclic shift patterns approach for nurse rostering problems
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976525
Mohammed Hadwan, M. Ayob
This paper introduces a semi-cyclic shift patterns approach (SCSPA) for solving the nurse rostering problem (NRP) at the Universiti Kebangsaan Malaysia Medical Centre (UKMMC). Because the night shift is the most problematic shift to assign, owing to its extra constraints, the paper proposes a semi-cyclic approach that first allocates predesigned night shift patterns cyclically and then allocates combined morning and evening shift patterns in a non-cyclic manner until the hard constraints are fulfilled. This differs from our previous work, which adopted a non-cyclic shift pattern approach (NCSPA) to construct all possible valid shift patterns, each a combination of morning, evening and night shifts forming a one-week shift pattern. Two one-week shift patterns were then allocated to each nurse to construct the initial roster. This paper compares the proposed semi-cyclic approach with the previous non-cyclic approach. Besides the minimum violation penalty, we count the number of good patterns that each algorithm produces in order to measure the quality of the constructed duty roster. A simulated annealing algorithm is then applied to improve the initial rosters produced by both algorithms. The semi-cyclic approach offers two benefits over our previous work: (i) the number of constructed shift patterns decreases remarkably, which reduces the construction time; and (ii) allocating night shift patterns fairly among all nurses becomes more manageable. Based on the obtained results, the semi-cyclic approach yields a better duty roster, as it produces more good patterns than our previous non-cyclic approach.
Citations: 4
A genetic based wrapper feature selection approach using Nearest Neighbour Distance Matrix
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976534
M. Sainin, R. Alfred
Feature selection for data mining optimization is in high demand, especially for high-dimensional feature vectors. Feature selection is a method used to select the best feature (or combination of features) for the data in order to achieve a similar or better classification rate. Currently, there are three types of feature selection methods: filter, wrapper and embedded. This paper describes a genetic-based wrapper approach that optimizes the feature selection process embedded in a classification technique called the supervised Nearest Neighbour Distance Matrix (NNDM). The method is implemented and tested on several datasets obtained from the UCI Machine Learning Repository and on other datasets. The results demonstrate that feature selection combined with the supervised NNDM has a significant impact on predictive accuracy in classifying new instances. It can therefore be used in other applications that require feature dimension reduction, such as image and bioinformatics classification.
Citations: 19
Data mining technique for expertise search in a special interest group knowledge portal
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976499
Wan Muhammad Zulhafizsyam Wan Ahmad, S. Sulaiman, U. K. Yusof
The Internet contributes to the development of electronic community (e-community) portals. Such portals have become an indispensable platform for members, especially members of a Special Interest Group (SIG), to share knowledge and expertise in their respective fields. Finding expertise through an e-community portal helps interested people and researchers identify other experts working in the same area. However, searching for such expertise in the portal is quite cumbersome. Expertise data mining could ease the search for experts. An effective data mining technique helps to analyze and measure expertise levels accurately in a SIG portal. This paper proposes a method called Expertise Data Mining (EDM) that comprises a few techniques for expertise search in a SIG portal. It is expected to improve the finding of experts among the members of a SIG e-community.
Citations: 0
Probability apriori based approach to mine rare association rules
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976537
Sandeep Singh Rawat, L. Rajamani
It is difficult to mine rare association rules involving unpredictable items, because approaches such as the Apriori algorithm and frequent pattern growth rely on a single minimum support, which ends up either too low or too high. If the minimum support is set high, it misses the frequent patterns involving rare items, since rare items fail to satisfy a high minimum support. In the literature, efforts have been made to extract rare association rules with multiple minimum supports. In this paper, we explore probability and propose a multiple-minsup-based, Apriori-like approach called Probability Apriori Multiple Minimum Support (PAMMS) to efficiently discover rare association rules. Experimental results show that the proposed approach is efficient.
Citations: 14
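The problem with a single minimum support, and the general idea of multiple minimum supports, can be illustrated with a small sketch. It is not PAMMS: the probability component from the abstract above is not modelled, the search is brute force rather than level-wise, and the per-item thresholds (`mis`), the rule that an itemset must reach the lowest threshold among its items, and the toy transactions are all assumptions for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def frequent_itemsets(transactions, mis, max_size=2):
    """Keep itemsets whose support reaches the lowest per-item threshold among their items."""
    items = sorted({i for t in transactions for i in t})
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            threshold = min(mis[i] for i in combo)
            s = support(frozenset(combo), transactions)
            if s >= threshold:
                result[combo] = s
    return result


# Toy data: two common items and two rare items that always occur together.
transactions = [{"bread", "milk"}] * 6 + [{"bread"}] * 2 + [{"caviar", "wine"}] * 2

single = frequent_itemsets(
    transactions, {"bread": 0.5, "milk": 0.5, "caviar": 0.5, "wine": 0.5})
multiple = frequent_itemsets(
    transactions, {"bread": 0.5, "milk": 0.5, "caviar": 0.1, "wine": 0.1})
```

With one threshold of 0.5 the rare pair is lost; lowering the thresholds only for the rare items recovers it without also admitting unrelated combinations such as bread with caviar, whose support is zero.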
Low-level Teamwork Hybridization for P-metaheuristics: A review and comparison
Pub Date : 2011-06-28 DOI: 10.1109/DMO.2011.5976516
S. Masrom, Siti Z.Z. Abidin, P. N. Hashimah, S. A.S. Abd. Rahman
Inspired by nature, many types of population-based metaheuristics, or P-metaheuristics, are emerging from research labs to help solve real-life problems. Since every metaheuristic has its own strengths and weaknesses, hybridizing the algorithms can sometimes produce better results. In the literature to date, Low-level Teamwork Hybridization is considered an effective and popular method for hybridizing P-metaheuristics. In many cases, however, the approach can prove quite complicated. Hybridization often requires modifying the internal structure of the metaheuristics so that the different algorithms fit well together. Another difficulty lies in determining which strategies to retain and which to drop or replace in each metaheuristic algorithm. This paper provides a general abstraction of P-metaheuristics and describes the main P-metaheuristic components that are suitable candidates for hybridization. A review and comparative study of several implementations of Low-level Teamwork Hybridization is also presented.
Citations: 1