
2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM): Latest Papers

Empirical comparison of correlation measures and pruning levels in complex networks representing the global climate system
Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949305
Alex Pelan, K. Steinhaeuser, N. Chawla, D. Pitts, A. Ganguly
Climate change is an issue of growing economic, social, and political concern. Continued rise in the average temperatures of the Earth could lead to drastic climate change or an increased frequency of extreme events, which would negatively affect agriculture, population, and global health. One way of studying the dynamics of the Earth's changing climate is by attempting to identify regions that exhibit similar climatic behavior in terms of long-term variability. Climate networks have emerged as a strong analytics framework for both descriptive analysis and predictive modeling of the emergent phenomena. Previously, the networks were constructed using only one measure of similarity, namely the (linear) Pearson cross correlation, and were then clustered using a community detection algorithm. However, nonlinear dependencies are known to exist in climate, which raises the question of whether more complex correlation measures are able to capture any such relationships. In this paper, we present a systematic study of different univariate measures of similarity and compare how each affects both the network structure and the predictive power of the clusters.
Citations: 8
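The network construction described in the abstract above can be illustrated with a minimal sketch. It assumes each location is a short time series, builds an edge wherever the absolute similarity reaches a pruning threshold, and contrasts Pearson correlation with Spearman rank correlation. Spearman is chosen here only as one example of a measure that captures monotone nonlinear dependence; the abstract does not say which measures the paper evaluates, and the toy series and the threshold of 0.95 are hypothetical.

```python
import math


def pearson(x, y):
    """Linear (Pearson) correlation of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Rank correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))


def build_network(series, measure, threshold):
    """Keep edge (i, j) when |similarity| reaches the pruning threshold."""
    edges = set()
    for i in range(len(series)):
        for j in range(i + 1, len(series)):
            if abs(measure(series[i], series[j])) >= threshold:
                edges.add((i, j))
    return edges


# Hypothetical toy data: location 1 is a cubic function of location 0.
series = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.0, 8.0, 27.0, 64.0, 125.0],
    [2.0, 1.0, 4.0, 3.0, 5.0],
]

pearson_edges = build_network(series, pearson, threshold=0.95)
spearman_edges = build_network(series, spearman, threshold=0.95)
```

At this pruning level the Pearson network has no edges, while the Spearman network links locations 0 and 1, which shows how both the similarity measure and the threshold shape the resulting graph.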
User-guided discovery of declarative process models
Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949297
F. Maggi, A. Mooij, Wil M.P. van der Aalst
Process mining techniques can be used to effectively discover process models from logs with example behaviour. Cross-correlating a discovered model with information in the log can be used to improve the underlying process. However, existing process discovery techniques have two important drawbacks. The produced models tend to be large and complex, especially in flexible environments where process executions involve multiple alternatives. This “overload” of information is caused by the fact that traditional discovery techniques construct procedural models explicitly showing all possible behaviours. Moreover, existing techniques offer limited possibilities to guide the mining process towards specific properties of interest. These problems can be solved by discovering declarative models. Using a declarative model, the discovered process behaviour is described as a (compact) set of rules. Moreover, the discovery of such models can easily be guided in terms of rule templates. This paper uses DECLARE, a declarative language that provides more flexibility than conventional procedural notations such as BPMN, Petri nets, UML ADs, EPCs and BPEL. We present an approach to automatically discover DECLARE models. This has been implemented in the process mining tool ProM. Our approach and toolset have been applied to a case study provided by the company Thales in the domain of maritime safety and security.
Citations: 152
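The idea of describing behaviour as a compact set of rules guided by templates can be sketched as follows. This is a minimal illustration, not the ProM implementation: it assumes a log is a list of traces (lists of activity names), instantiates only the "response" template (every occurrence of A is eventually followed by B), and keeps the instantiations that hold in a given fraction of traces. The function names, the `min_support` parameter, and the toy log are hypothetical.

```python
from itertools import permutations


def satisfies_response(trace, a, b):
    """True if every occurrence of `a` is eventually followed by `b`."""
    pending = False
    for event in trace:
        if event == a:
            pending = True
        elif event == b:
            pending = False
    return not pending


def discover_response_rules(log, min_support=1.0):
    """Return (a, b) pairs whose response rule holds in enough traces.

    Traces that never contain `a` satisfy the rule vacuously.
    """
    activities = sorted({event for trace in log for event in trace})
    rules = []
    for a, b in permutations(activities, 2):
        satisfied = sum(satisfies_response(trace, a, b) for trace in log)
        if satisfied / len(log) >= min_support:
            rules.append((a, b))
    return rules


# Hypothetical toy event log.
log = [
    ["register", "check", "pay"],
    ["register", "pay"],
    ["register", "check", "check", "pay"],
]

rules = discover_response_rules(log)
```

Restricting the search to one template is what keeps the result small here: two rules summarise all three traces, whereas a procedural model would have to show every path explicitly.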
Local neighbourhood extension of SMOTE for mining imbalanced data
Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949434
Tomasz Maciejewski, J. Stefanowski
In this paper we discuss problems of inducing classifiers from imbalanced data and improving recognition of the minority class using focused resampling techniques. We are particularly interested in the SMOTE over-sampling method, which generates new synthetic examples from the minority class between the closest neighbours from this class. However, SMOTE could also overgeneralize the minority class region, as it does not consider the distribution of other neighbours from the majority classes. Therefore, we introduce a new generalization of SMOTE, called LN-SMOTE, which exploits information about the local neighbourhood of the considered examples more precisely. In the experiments we compare this method with the original SMOTE and its two most closely related generalizations, Borderline and Safe-Level SMOTE. All these pre-processing methods are applied together with either decision tree or Naive Bayes classifiers. The results show that the new LN-SMOTE method improves evaluation measures for the minority class.
Citations: 218
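The interpolation step of the original SMOTE, as summarised in the abstract above, can be sketched in a few lines. This is a simplified illustration of plain SMOTE only, not of LN-SMOTE: it looks at minority examples alone, which is exactly why it can overgeneralize into majority regions. The function name, the parameters `n_new`, `k`, and `seed`, and the toy points are hypothetical.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points between minority examples.

    Each synthetic point lies on the segment joining a randomly chosen
    minority example and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class examples in two dimensions.
minority = [(1.0, 1.0), (1.5, 1.2), (2.0, 1.0), (1.2, 1.8)]
new_points = smote(minority, n_new=5)
```

Because every synthetic point is a convex combination of two minority examples, it stays inside the bounding box of the minority class, regardless of where majority examples happen to lie.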
Data mining driven agents for predicting online auction's end price
Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949427
Preetinder Kaur, M. Goyal, Jie Lu
Auctions can be characterized by the distinct nature of their feature space. This feature space may include opening price, closing price, average bid rate, bid history, seller and buyer reputation, number of bids, and many more. In this paper, a clustering-based method is used to forecast the end price of an online auction for an autonomous agent-based system. In the proposed model, the input auction space is partitioned into groups of similar auctions by the k-means clustering algorithm. The recurrent problem of finding the value of k in the k-means algorithm is solved by employing the elbow method with one-way analysis of variance (ANOVA). Then k regression models are employed to estimate the forecasted price of an online auction. Based on the transformed data after clustering and the characteristics of the current auction, a bid selector nominates the regression model for the current auction whose price is to be forecasted. Our results show improvements in end-price prediction for each cluster, which supports the proposed clustering-based model for bid prediction in the online auction environment.
Citations: 11
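The cluster-then-regress pipeline in the abstract above can be sketched under strong simplifying assumptions: a single feature (opening price), a fixed k rather than one chosen by the elbow method, and ordinary least-squares lines as the per-cluster models. All function names and the toy auctions are hypothetical; the "bid selector" is reduced here to picking the nearest centroid.

```python
def nearest_centroid(value, centroids):
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda c: abs(value - centroids[c]))


def kmeans_1d(values, k, iters=20):
    """Plain k-means on scalars with a deterministic spread-out start."""
    ordered = sorted(values)
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[nearest_centroid(v, centroids)].append(v)
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids


def fit_line(xs, ys):
    """Least-squares slope and intercept; (0, 0) for an empty cluster."""
    if not xs:
        return 0.0, 0.0
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def train(auctions, k):
    """auctions: list of (opening_price, end_price) pairs."""
    centroids = kmeans_1d([a[0] for a in auctions], k)
    models = []
    for c in range(k):
        members = [a for a in auctions if nearest_centroid(a[0], centroids) == c]
        models.append(fit_line([m[0] for m in members], [m[1] for m in members]))
    return centroids, models


def predict(opening_price, centroids, models):
    """Select the model of the nearest cluster and apply it."""
    slope, intercept = models[nearest_centroid(opening_price, centroids)]
    return slope * opening_price + intercept


# Hypothetical auctions: cheap items follow 2x + 1, expensive ones 1.5x.
auctions = [(1, 3), (2, 5), (3, 7), (100, 150), (110, 165), (120, 180)]
centroids, models = train(auctions, k=2)
cheap_forecast = predict(4, centroids, models)
expensive_forecast = predict(130, centroids, models)
```

A single regression line fitted to all six auctions would fit neither group well, which is the motivation for selecting a model per cluster.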
Computational intelligence methods for processing misaligned, unevenly sampled time series containing missing data
Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949447
F. Cismondi, André S. Fialho, S. Vieira, J. Sousa, S. Reti, M. Howell, S. Finkelstein
One consequence of the increasing amount of data stored during acquisition processes is that sampled time series are more prone to be collected in a misaligned, uneven fashion and/or be partly lost or unavailable (missing data). Due to their severe impact on data mining techniques, this work proposes methods to (a) align misaligned, unevenly sampled data, (b) differentiate absent values related to low sampling frequencies from those resulting from missingness mechanisms, and (c) classify recoverable and non-recoverable segments of missing data by using statistical and fuzzy modeling approaches. These methods were evaluated against randomly simulated test datasets containing different amounts of missing data. Results show that: (1) using the most frequently sampled variable as a template, combined with cubic interpolation, made it possible to realign misaligned, unevenly sampled data without significant errors; (2) absent values due to low sampling frequencies can be successfully differentiated from those truly missing using 95% confidence intervals relative to the mean sampling time; (3) fuzzy modeling returned better classification results for recoverable segments, while the statistical approach performed better in classifying non-recoverable segments. The performance of all three methods decreased when the amount of missing data in the test datasets increased.
Citations: 33
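Steps (a) and (b) of the abstract above can be sketched roughly as follows. Two simplifications are assumptions of this sketch rather than the paper's method: linear interpolation stands in for the cubic interpolation the authors report, and the "95% confidence interval relative to the mean sampling time" is read as the mean sampling interval plus 1.96 sample standard deviations. Function names and data are hypothetical, and sample times are assumed to be strictly increasing.

```python
import statistics


def interpolate(times, values, t):
    """Linearly interpolate at time t; None outside the sampled range."""
    if t < times[0] or t > times[-1]:
        return None
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        if t0 <= t <= t1:
            v0, v1 = values[i], values[i + 1]
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return values[-1]  # only reached for a single-sample series


def align(template_times, times, values):
    """Resample a series onto the time grid of the template variable."""
    return [interpolate(times, values, t) for t in template_times]


def flag_gaps(times, z=1.96):
    """Return (start, end) intervals that are unusually long.

    An interval is flagged when it exceeds the mean sampling interval
    by more than z sample standard deviations.
    """
    intervals = [b - a for a, b in zip(times, times[1:])]
    limit = statistics.mean(intervals) + z * statistics.stdev(intervals)
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > limit]


# Template: the most frequently sampled variable's time grid.
template_times = [0, 1, 2, 3, 4]
aligned = align(template_times, times=[0.5, 2.5, 3.5], values=[10.0, 30.0, 40.0])

# Regular sampling every time unit, then one long silence.
gaps = flag_gaps([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30])
```

Template times that fall outside the sampled range stay `None` instead of being extrapolated, so they remain visible as absent values for the later classification step.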
Using gaming strategies for attacker and defender in recommender systems
Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949304
J. Zhan, Lijo Thomas, Venkata Pasumarthi
Ratings are prominent factors in deciding the fate of any product in the present Internet market, and many people genuinely rely on them. Unfortunately, Sybil attacks can affect the credibility of a genuine product. Influence limiter algorithms in recommender systems have been used extensively to overcome Sybil attacks, but the effort has not reached a safe level. This paper highlights an approach to generating gaming strategies for the attacker and defender in a recommender system. In a given recommender system environment, attackers and defenders play the most crucial part in a gaming strategy. These game-theoretic strategies represent a sequence of decision rules that an attacker or defender may use to achieve their desired goal. The defenders efficiently apply valid approaches to ward off Sybil attacks from the attackers. In our approach, we define attack graphs, use cases, and misuse cases in our gaming framework to analyze the vulnerabilities and security measures incorporated in a recommender system.
Citations: 4
Generating materialized views using ant based approaches and information retrieval technologies
Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949302
H. Drias
In this paper, a hybrid system combining ant-based approaches and tabu search has been designed for the generation of materialized views in a relational data warehouse environment, with the purpose of improving query performance. Two ACO algorithms were adapted for the view generation problem to take up the scalability challenge, and information retrieval technologies are used in the search process. In addition, our approach dynamically manages the storage to include the best views determined by the bio-inspired approach. Experiments have been conducted to validate the designed algorithms, and interesting performance is observed in comparison with previous related work.
Citations: 5
Enhancing precision in Process Conformance: Stability, confidence and severity
Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949451
J. Munoz-Gama, J. Carmona
Process Conformance is becoming a crucial area due to the changing nature of processes within an Information System. By confronting specifications against system executions (the main problem tackled in process conformance), both system bugs and obsolete/incorrect specifications can be revealed. This paper presents novel techniques to enrich the process conformance analysis for the precision dimension. The new features of the metric proposed in this paper provide a complete view of the precision between a log and a model. The techniques have been implemented as a plug-in in an open-source Process Mining platform, and experimental results supporting both the theory and the goals of this work are presented.
Citations: 74
Sectors on sectors (SonS): A new hierarchical clustering visualization tool
Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949448
J. Martínez-Martínez, Pablo Escandell-Montero, E. Soria-Olivas, J. Martín-Guerrero, M. Martínez-Sober, J. Gómez-Sanchís
Clustering techniques have been widely applied to extract information from high-dimensional data structures in the last few years. Graphs are especially relevant for clustering, but many graphs associated with hierarchical clustering do not give any information about the values of the centroids' attributes and the relationships among them. In this paper, we propose a new visualization approach for hierarchical cluster analysis in which the above-mentioned information is available. The method is based on pie charts. The pie charts are divided into several pie segments or sectors corresponding to each cluster. The radius of each pie segment is proportional to the number of patterns included in each cluster. By means of new divisions in each pie sector and a color bar with as many labels as attributes, we can extract all the existing relationships among centroids' attributes at any hierarchy level. The methodology is tested on one synthetic data set and one real data set. The results show the suitability and usefulness of the proposed approach.
Citations: 5
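The geometric rule stated in the abstract above, that each sector's radius is proportional to the number of patterns in its cluster, can be sketched as a small layout computation. Giving every cluster the same angular width and scaling radii against the largest cluster are assumptions of this sketch; the abstract does not specify either. The function name and the cluster sizes are hypothetical, and no drawing is performed.

```python
import math


def sector_layout(cluster_sizes, max_radius=1.0):
    """Compute start angle, end angle, and radius for each cluster sector.

    Angles are in radians. The largest cluster gets max_radius and the
    others are scaled down in proportion to their pattern counts.
    """
    angle = 2 * math.pi / len(cluster_sizes)
    largest = max(cluster_sizes)
    return [
        {
            "start": i * angle,
            "end": (i + 1) * angle,
            "radius": max_radius * size / largest,
        }
        for i, size in enumerate(cluster_sizes)
    ]


# Hypothetical pattern counts for four clusters at one hierarchy level.
layout = sector_layout([50, 25, 100, 25])
radii = [sector["radius"] for sector in layout]
```

Each dictionary could then be handed to any plotting routine that draws wedges, with the further subdivision per attribute left out of this sketch.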
A banner recommendation system based on web navigation history
Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949437
G. Giuffrida, D. Recupero, Giuseppe Tribulato, C. Zarba
We address the problem of selecting a banner advertisement, based on the profile of the online user. The profile consists of the set of webpages opened by the online user, optionally clustered.
Citations: 4