Journal of Artificial Intelligence and Data Mining: Latest Articles, Page 6

Probabilistic Reasoning and Markov Chains as Means to Improve Performance of Tuning Decisions under Uncertainty

Journal of Artificial Intelligence and Data Mining

Pub Date : 2020-09-09 DOI: 10.22044/JADM.2020.8920.2027

A. Omondi, I. A. Lukandu, G. Wanyembi

Variable environmental conditions and runtime phenomena require developers of complex business information systems to expose configuration parameters to system administrators. Administrators can then intervene by tuning the bottleneck configuration parameters, either in response to current changes or in anticipation of future ones, to keep the system’s performance at an optimum level. However, these manual tuning interventions are error-prone and unstandardized because of fatigue, varying levels of expertise, and over-reliance on inaccurate predictions of the system’s future states. This research therefore investigates how the capacity of probabilistic reasoning to handle uncertainty can be combined with the capacity of Markov chains to map stochastic environmental phenomena to ideal self-optimization actions. A comparative experimental research design was used, with quantitative data collected through simulations of different algorithm variants. The results indicate that applying the algorithm in a distributed database system improves the performance of tuning decisions under uncertainty. The improvement was measured as a response-time latency 27% lower than average and a transaction throughput 17% higher than average.
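
To make the idea of mapping stochastic states to tuning actions concrete, the following is a minimal sketch, not the authors' algorithm. The workload states, the transition probabilities, and the action names are all hypothetical values chosen for illustration; the abstract does not specify them.

```python
# Hypothetical workload states and transition probabilities (illustrative only).
STATES = ["low", "medium", "high"]
P = [
    [0.7, 0.2, 0.1],  # transitions out of "low"
    [0.3, 0.4, 0.3],  # transitions out of "medium"
    [0.1, 0.3, 0.6],  # transitions out of "high"
]
# Hypothetical mapping from the most likely state to a tuning action.
ACTIONS = {"low": "shrink_pool", "medium": "keep_pool", "high": "grow_pool"}


def step(dist, matrix):
    """One Markov step: new_dist[j] = sum_i dist[i] * matrix[i][j]."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def predict(dist, matrix, steps):
    """Propagate a state distribution forward by the given number of steps."""
    for _ in range(steps):
        dist = step(dist, matrix)
    return dist


def choose_action(dist):
    """Pick the action associated with the most probable state."""
    best = max(range(len(dist)), key=lambda i: dist[i])
    return ACTIONS[STATES[best]]
```

For example, `choose_action(predict([0.0, 0.0, 1.0], P, 1))` looks one step ahead from a system that is currently under high load and returns the action tied to the most probable next state.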



Citations: 1

A New Incentive Mechanism to Detect and Restrict Sybil Nodes in P2P File-Sharing Networks with a Heterogeneous Bandwidth

Journal of Artificial Intelligence and Data Mining

Pub Date : 2020-07-21 DOI: 10.22044/JADM.2020.9063.2049

M. Shareh, H. Navidi, H. Javadi, M. Hosseinzadeh

In cooperative P2P networks, there are two kinds of illegal users: free riders and Sybils. Free riders try to receive services without bearing any cost. Sybil users are rational peers that hold multiple fake identities. Several techniques for detecting free riders and Sybil users have been proposed, such as Tit-for-Tat and SybilGuard. Although these techniques are quite successful at detecting free riders and Sybils individually, no technique is capable of detecting both kinds of users simultaneously. The main objective of this research is therefore to propose a single mechanism, based on game theory, that detects both. The basic idea of the proposed solution is to derive new centrality and bandwidth-contribution formulas using an incentive-mechanism approach. The results show that, as the network ages, free riders are identified, and that detecting Sybil nodes reduces the number of services offered to them.



Citations: 1

A Combinatorial Algorithm for Fuzzy Parameter Estimation with Application to Uncertain Measurements

Journal of Artificial Intelligence and Data Mining

Pub Date : 2020-07-21 DOI: 10.22044/JADM.2020.8610.1996

M. Danesh, S. Danesh

This paper presents a new method for regression model prediction in an uncertain environment. In practical engineering problems, a regression or ANN model is usually built by feeding it the average of a set of repeated observations as an input variable. The estimated response of the process is therefore also the average of a set of output values, and the variation around that mean is left undetermined. However, unbiased and precise estimation requires that predictions be correct on average and that the spread of the data be specified. To address this issue, we propose a method based on a fuzzy inference system combined with genetic and linear programming algorithms. We consider crisp inputs and a symmetrical triangular fuzzy output. The proposed algorithm is used to fit the fuzzy regression model. A simulation example and a practical example from the field of machining processes are used to assess the method on practical problems in which the output variables are uncertain and imprecise. Finally, we compare the performance of the proposed method with that of other methods. The examples verify the proposed method for prediction. The results show that the proposed method reduces the error values to a minimum level and is more accurate than the linear programming (LP) and fuzzy weights with linear programming (FWLP) methods.
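
As a rough illustration of what "crisp inputs and a symmetrical triangular fuzzy output" can look like, the sketch below evaluates a fuzzy linear model whose coefficients each have a center and a spread. This functional form is an assumption (a common textbook formulation), not necessarily the authors' model, and the fitting step with genetic and linear programming algorithms is not shown. All names are hypothetical.

```python
def predict_fuzzy(x, centers, spreads):
    """Return (center, spread) of a symmetric triangular fuzzy output.

    Assumed model: coefficient k is a symmetric triangular fuzzy number with
    center centers[k] and spread spreads[k]; index 0 is the intercept and
    x holds the crisp inputs.
    """
    terms = [1.0] + list(x)
    center = sum(c * t for c, t in zip(centers, terms))
    spread = sum(s * abs(t) for s, t in zip(spreads, terms))
    return center, spread


def membership(y, center, spread):
    """Membership degree of a crisp value y in the triangular fuzzy output."""
    if spread <= 0.0:
        return 1.0 if y == center else 0.0
    return max(0.0, 1.0 - abs(y - center) / spread)
```

With `centers=[1.0, 2.0]` and `spreads=[0.5, 0.25]`, the input `x=[3.0]` yields a fuzzy output centered at 7.0 with spread 1.25, so an observed value of 7.0 has full membership and values more than 1.25 away have none.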



Citations: 2

A Fuzzy C-means Algorithm for Clustering Fuzzy Data and Its Application in Clustering Incomplete Data

Journal of Artificial Intelligence and Data Mining

Pub Date : 2020-07-13 DOI: 10.22044/JADM.2020.9021.2038

J. Tayyebi, E. Hosseinzadeh

The fuzzy c-means clustering algorithm is a useful clustering tool, but it is suited only to crisp, complete data. This article proposes an enhancement of the algorithm that is suitable for clustering trapezoidal fuzzy data. A linear ranking function is used to define a distance for trapezoidal fuzzy data. Then, as an application, a method based on the proposed algorithm is presented for clustering incomplete fuzzy data. The method replaces each missing attribute with a trapezoidal fuzzy number determined from the corresponding attribute of the q nearest neighbors. Comparison and analysis of the experimental results demonstrate the capability of the proposed method.


Citations: 3

A Monte Carlo-Based Search Strategy for Dimensionality Reduction in Performance Tuning Parameters

Journal of Artificial Intelligence and Data Mining

Pub Date : 2020-07-05 DOI: 10.22044/JADM.2020.9403.2076

A. Omondi, I. A. Lukandu, G. Wanyembi

Redundant and irrelevant features in high-dimensional data increase the complexity of the underlying mathematical models. Pre-processing steps that search for the most relevant features are therefore needed to reduce the dimensionality of the data. This study used a meta-heuristic search approach that relies on lightweight random simulations to balance the exploitation of relevant features against the exploration of features that have the potential to be relevant. In doing so, the study evaluated how effectively manipulating the search component of feature selection achieves high accuracy with reduced dimensions. A control-group experimental design was used to observe factual evidence. The context of the experiment was the high-dimensional data encountered in performance tuning of complex database systems. The Wilcoxon signed-rank test at the 0.05 level of significance was used to compare repeated classification accuracy measurements on the independent experiment and control group samples. Encouraging results with a p-value < 0.05 were recorded, providing evidence to reject the null hypothesis in favour of the alternative hypothesis, which states that meta-heuristic search approaches are effective in achieving high accuracy with reduced dimensions, depending on the outcome variable under investigation.
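
For readers unfamiliar with the Wilcoxon signed-rank test, the sketch below computes its rank sums for paired measurements, assuming each experiment-group measurement is paired with a control-group measurement. It only produces the statistic; converting it to a p-value needs a table or a statistics library and is left out. The function name is hypothetical.

```python
def signed_rank_sums(a, b):
    """Return (W_plus, W_minus) for paired samples a and b.

    Zero differences are dropped, and tied absolute differences receive the
    average of the ranks they span.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Extend j over the run of equal absolute differences starting at i.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus
```

The test statistic is usually taken as the smaller of the two sums; a small value relative to the number of non-zero differences suggests that one group is consistently more accurate than the other.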



Citations: 0

Development of an Ensemble Multi-stage Machine for Prediction of Breast Cancer Survivability

Journal of Artificial Intelligence and Data Mining

Pub Date : 2020-07-01 DOI: 10.22044/JADM.2020.8406.1978

M. Salehi, J. Razmara, S. Lotfi

Prediction of cancer survivability using machine learning techniques has become a popular approach in recent years. An important issue in this regard is that obtaining some features may require difficult and costly experiments, even though these features have a less significant impact on the final decision and can be omitted from the feature set. Developing a machine for survivability prediction that ignores these features in simple cases while still yielding acceptable prediction accuracy has therefore become a challenge for researchers. In this paper, we develop an ensemble multi-stage machine for survivability prediction that ignores difficult features in simple cases. In the first stage, the machine employs three base learners, namely a multilayer perceptron (MLP), a support vector machine (SVM), and a decision tree (DT), to predict survivability using simple features. If the learners agree on the output, the machine makes the final decision in the first stage. Otherwise, for difficult cases in which the learners' outputs differ, the machine makes the decision in the second stage using an SVM over all features. The developed model was evaluated using the Surveillance, Epidemiology, and End Results (SEER) database. The experimental results show that the developed machine obtains considerable accuracy while ignoring difficult features for most of the input samples.
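
The two-stage decision rule described above can be sketched as follows. The learners are represented as plain callables rather than trained MLP, SVM, and DT models, and the costly features are fetched through a hypothetical `get_all_features` callback so that they are only obtained when the first stage disagrees. All names are illustrative assumptions.

```python
def two_stage_predict(simple_features, get_all_features, stage1_learners, stage2_learner):
    """Return (prediction, stage) for one sample.

    Stage 1: every learner predicts from the simple features; a unanimous
    vote is final. Stage 2: otherwise the costly features are fetched and
    the second-stage learner decides over the full feature set.
    """
    votes = [learner(simple_features) for learner in stage1_learners]
    if all(vote == votes[0] for vote in votes):
        return votes[0], 1
    return stage2_learner(get_all_features()), 2
```

Passing the full feature set as a callback rather than a value reflects the motivation in the abstract: for samples resolved in the first stage, the difficult features never need to be collected.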



Citations: 5

VHR Semantic Labeling by Random Forest Classification and Fusion of Spectral and Spatial Features on Google Earth Engine

Journal of Artificial Intelligence and Data Mining

Pub Date : 2020-07-01 DOI: 10.22044/JADM.2020.8252.1964

M. Kakooei, Y. Baleghi

Semantic labeling is an active field in remote sensing applications. Although handling highly detailed objects in Very High Resolution (VHR) optical images and VHR Digital Surface Models (DSM) is a challenging task, it can improve the accuracy of semantic labeling methods. This paper proposes a semantic labeling method based on the fusion of optical and normalized DSM data. Spectral and spatial features are fused into a Heterogeneous Feature Map to train the classifier. The classes in the evaluation database are impervious surface, building, low vegetation, tree, car, and background. The proposed method is implemented on Google Earth Engine and consists of several levels. First, Principal Component Analysis is applied to vegetation indices to find the maximally separable color space between vegetation and non-vegetation areas. A Gray Level Co-occurrence Matrix is computed to provide texture information as spatial features. Several Random Forests are trained on an automatically selected training dataset. Several spatial operators are applied after classification to refine the result. A Leaf-Less-Tree feature is used to solve the underestimation problem in tree detection. The area, major axis, and minor axis of connected components are used to refine building and car detection. Evaluation shows a significant improvement in tree, building, and car accuracy. Overall accuracy and the Kappa coefficient are appropriate.
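
As a small illustration of the texture step, the sketch below builds a Gray Level Co-occurrence Matrix for one pixel offset and derives the contrast feature from it. It is plain Python on a list-of-lists image, not the Google Earth Engine implementation used in the paper, and the choice of offset and of the contrast feature is an assumption.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count co-occurrences of gray levels for the pixel offset (dx, dy).

    image is a list of rows of integer gray levels in range(levels).
    The result is not symmetrized or normalized.
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """GLCM contrast: sum of (i - j)^2 weighted by co-occurrence probability."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total
```

For the 2x3 image `[[0, 0, 1], [1, 1, 0]]` with two gray levels and a horizontal offset of one pixel, every pair of levels co-occurs once, so the contrast is 0.5.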



Citations: 3

Vehicle Type Recognition based on Dimension Estimation and Bag of Word Classification

Journal of Artificial Intelligence and Data Mining

Pub Date : 2020-07-01 DOI: 10.22044/JADM.2020.8375.1975

R. A. Dehkordi, H. Khosravi

Fine-grained vehicle type recognition is one of the main challenges in machine vision. Almost all methods presented so far identify the type of vehicle using feature extraction and classifiers. Because of the apparent similarity between car classes, these methods may produce erroneous results. This paper presents a methodology that uses two criteria to identify common vehicle types. The first criterion is feature extraction and classification, and the second is the dimensions of the car. The method consists of three phases. In the first phase, the coordinates of the vanishing points are obtained. In the second phase, the bounding box and dimensions are calculated for each passing vehicle. In the third phase, the exact vehicle type is determined by combining the results of the first and second criteria. The proposed method was evaluated on a dataset of images and videos prepared by the authors and recorded from positions similar to those of a roadside camera. Most existing methods use high-quality images for evaluation and are not applicable in the real world; the proposed method instead uses real-world video frames to determine the exact type of vehicle and achieves an accuracy of 89.5%, which represents good performance.



Citations: 9

Improving Accuracy of Recommender Systems using Social Network Information and Longitudinal Data

Journal of Artificial Intelligence and Data Mining

Pub Date : 2020-07-01 DOI: 10.22044/JADM.2020.7326.1871

B. Hassanpour, N. Abdolvand, S. R. Harandi

The rapid development of technology, the Internet, and electronic commerce has led to the emergence of recommender systems. These systems assist users in finding and selecting their desired items. The accuracy of the recommendations is one of the main challenges of these systems. Given the capability of fuzzy systems to determine the borders of user interests, it seems reasonable to combine them with social network information and the time factor. Hence, this study, for the first time, assesses the efficiency of recommender systems by combining fuzzy logic, longitudinal data, and social network information such as tags, friendship, and group membership. The impact of the proposed algorithm on the accuracy of recommender systems was studied by specifying the neighborhood and the border between users’ preferences over time. The results reveal that using longitudinal data and social network information in memory-based recommender systems improves the accuracy of these systems.


Citations: 0

Chaotic-based Particle Swarm Optimization with Inertia Weight for Optimization Tasks

Journal of Artificial Intelligence and Data Mining

Pub Date : 2020-07-01 DOI: 10.22044/JADM.2020.8594.1993

N. Mobaraki, R. Boostani, M. Sabeti

Among the variety of meta-heuristic population-based search algorithms, particle swarm optimization (PSO) with adaptive inertia weight (AIW) is considered a versatile optimization tool that incorporates the experience of the whole swarm into the movement of the particles. Although the exploitation ability of this algorithm is strong, it cannot comprehensively explore the search space and may become trapped in a local minimum within a limited number of iterations. To increase its diversity and enhance its exploration ability, this paper inserts a chaotic factor, generated by three chaotic systems, along with a perturbation stage into AIW-PSO to avoid premature convergence, especially in complex nonlinear problems. To assess the proposed method, a known optimization benchmark containing complex nonlinear functions was selected, and its results were compared with those of standard PSO, AIW-PSO, and the genetic algorithm (GA). The empirical results demonstrate the superiority of the proposed chaotic AIW-PSO over its counterparts on 21 functions, which confirms the promising role of inserting randomness into AIW-PSO. The behavior of the error over the epochs shows that the proposed method can smoothly find proper minima in a timely manner without premature convergence.
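The general idea of a chaotic factor and a perturbation stage in inertia-weight PSO can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not name the three chaotic systems, the adaptive weight rule, or the perturbation scheme. The sketch assumes a single logistic map as the chaotic source, a linearly decreasing inertia weight as a stand-in for AIW, a Gaussian jitter of the global best as the perturbation, and a sphere function as the objective. All function names and parameter values are hypothetical.

```python
import random


def logistic_map(z):
    """One step of the logistic map with r = 4, which is chaotic on (0, 1)."""
    return 4.0 * z * (1.0 - z)


def sphere(p):
    """Simple test objective (an assumption; not the paper's benchmark)."""
    return sum(x * x for x in p)


def chaotic_pso(f, dim=5, n_particles=20, iters=200, bounds=(-5.0, 5.0), seed=0):
    """Minimize f with PSO whose inertia weight is scaled by a chaotic factor."""
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(x):
        return min(hi, max(lo, x))

    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    z = 0.37         # chaotic state (assumed start, away from fixed points)
    c1 = c2 = 1.5    # acceleration coefficients (assumed values)
    for t in range(iters):
        z = logistic_map(z)
        if not 0.0 < z < 1.0:   # guard against floating-point collapse
            z = 0.37
        # Linearly decreasing weight (stand-in for AIW) times the chaotic factor.
        w = (0.9 - 0.5 * t / iters) * z
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = clip(pos[i][d] + vel[i][d])
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
            if val < gbest_val:
                gbest, gbest_val = pos[i][:], val
        # Perturbation stage (assumed form): jitter the global best, keep it if better.
        cand = [clip(x + rng.gauss(0.0, 0.01 * (hi - lo))) for x in gbest]
        cand_val = f(cand)
        if cand_val < gbest_val:
            gbest, gbest_val = cand, cand_val
    return gbest, gbest_val


best, best_val = chaotic_pso(sphere)
print(best_val)
```

Scaling the weight by the chaotic value makes the step size vary irregularly between iterations, which is one plausible way to add the diversity the abstract describes. The accepted-only-if-better perturbation keeps the best value from ever getting worse.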



Citations: 2