Combination and optimization of classifiers in gender classification using genetic programming
Asifullah Khan, A. Majid, A. M. Mirza
International Journal of Knowledge-Based and Intelligent Engineering Systems, 2005-01-01
DOI: 10.3233/KES-2005-9101 (https://doi.org/10.3233/KES-2005-9101)
Citations: 49
Abstract
This paper investigates gender classification from frontal facial images. Four classifiers are compared: K-means, k-nearest neighbors, Linear Discriminant Analysis, and a Mahalanobis-distance-based classifier. The receiver operating characteristic (ROC) curve and the area under its convex hull (AUCH) are used as performance measures for each classifier at different feature subsets. To summarize a classifier's overall performance in a single scalar value, we propose a new scheme: the area under the convex hull of the AUCH values of the ROC curves (AUCH of AUCHs). When the number of macro features is increased beyond 5, the AUCH saturates and, for some classifiers, even decreases, illustrating the curse of dimensionality. We then use genetic programming to combine classifiers, evolving an optimum combined classifier (OCC) that outperforms the individual classifiers. Using only two features, the OCC performs comparably to the original classifier using 20 macro features. It achieves true positive rates as high as 0.94 at false positive rates as low as 0.15 for a 1:3 training-to-testing ratio. We also observe that heterogeneous combinations of classifiers are more promising than homogeneous ones.
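The AUCH measure named in the abstract can be illustrated with a short sketch. This is not the authors' implementation: it assumes the usual ROC convex hull construction, in which the operating points are joined with the trivial points (0, 0) and (1, 1), the upper convex hull is taken, and the area beneath it is computed with the trapezoid rule. The function names `roc_upper_hull` and `auch` are hypothetical, and apart from the (0.15, 0.94) operating point quoted in the abstract, the sample points are made up for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_upper_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of ROC points, always including (0, 0) and (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Walking left to right, the upper hull only makes right turns.
        # Pop the last point while it lies on or below the chord to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def auch(points: List[Point]) -> float:
    """Area under the ROC convex hull, via the trapezoid rule."""
    hull = roc_upper_hull(points)
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(hull, hull[1:])
    )


if __name__ == "__main__":
    # Single operating point taken from the abstract: FPR 0.15, TPR 0.94.
    print(auch([(0.15, 0.94)]))
    # Hypothetical points; (0.3, 0.7) falls under the hull and is dropped.
    print(roc_upper_hull([(0.1, 0.6), (0.3, 0.7), (0.5, 0.95)]))
```

A classifier with no operating point above the diagonal gets an AUCH of 0.5 under this construction, because the hull collapses to the chance line. How the paper builds the "AUCH of AUCHs" curve across feature subsets is not specified in the abstract, so it is not sketched here.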