Why does output normalization create problems in multiple classifier systems?
H. Altınçay, M. Demirekler
Pub Date: 2002-12-10 | DOI: 10.1109/ICPR.2002.1048417
A combination of classifiers is a promising direction for obtaining better classification systems. However, the outputs of different classifiers may have different scales, making them incomparable. This incomparability of the output scores is a major problem in combining different classification systems. To avoid it, measurement-level classifier outputs are generally normalized. However, recent studies have shown that output normalization may itself introduce problems: for instance, the multiple classifier system's performance may become worse than that of a single individual classifier. This paper presents some observations about why such undesirable behavior occurs.
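The scale problem the abstract describes can be made concrete with a toy sketch (our illustration, not the paper's experiments): two classifiers score three classes on incomparable scales, and the sum rule is dominated by the larger-scale classifier unless each output is normalized first.

```python
def min_max(scores):
    """Rescale one classifier's class scores to [0, 1]."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]

raw_a = [0.2, 0.5, 0.3]     # probability-like scores
raw_b = [12.0, 30.0, 58.0]  # e.g. unnormalized distances or log-odds

# Raw sum rule: classifier B's scale swamps classifier A entirely.
raw_sum = [a + b for a, b in zip(raw_a, raw_b)]

# Normalized sum rule: both classifiers contribute comparably.
norm_sum = [a + b for a, b in zip(min_max(raw_a), min_max(raw_b))]

winner_raw = raw_sum.index(max(raw_sum))    # decided by B alone
winner_norm = norm_sum.index(max(norm_sum)) # a genuine joint decision
```

Note that the two rules disagree here (class 2 vs. class 1); the paper's point is that the normalization step itself can also flip decisions for the worse.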
Tree pruning for output coded ensembles
T. Windeatt, G. Ardeshir
Pub Date: 2002-12-10 | DOI: 10.1109/ICPR.2002.1048245
Output coding converts a multiclass problem into several binary subproblems, giving an ensemble of binary classifiers. Like other ensemble methods, its performance depends on the accuracy and diversity of the base classifiers. If a decision tree is chosen as the base classifier, the issue of tree pruning needs to be addressed. In this paper we investigate the effect of six pruning methods on ensembles of trees generated by error-correcting output codes (ECOC). Our results show that error-based pruning performs best on most datasets, but that it is better not to prune at all than to select a single pruning strategy for all datasets.
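A minimal sketch of the ECOC decoding the abstract relies on (the code matrix and outputs below are our toy values, not the paper's): each column of the code matrix defines one binary subproblem, and a test pattern is assigned to the class whose codeword is nearest in Hamming distance to the vector of binary classifier outputs.

```python
CODE = {            # 3 classes x 5 binary subproblems (columns)
    "A": [0, 0, 1, 1, 0],
    "B": [0, 1, 0, 1, 1],
    "C": [1, 0, 0, 0, 1],
}

def hamming(u, v):
    """Number of positions where two bit vectors disagree."""
    return sum(a != b for a, b in zip(u, v))

def decode(bits):
    """Assign the class whose codeword is closest to the ensemble output."""
    return min(CODE, key=lambda c: hamming(CODE[c], bits))

# Even with one of the five base classifiers wrong (bit 3 flipped from
# B's codeword), decoding still recovers class "B".
pred = decode([0, 1, 0, 0, 1])
```

The error-correcting margin between codewords is what lets diverse but individually imperfect base trees combine into an accurate ensemble.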
An appearance based approach for video object extraction and representation
Gaurav Garg, P. Sharma, S. Chaudhury, R. Choudhury
Pub Date: 2002-12-10 | DOI: 10.1109/ICPR.2002.1048358
We describe a novel appearance based scheme for extraction and representation of video objects. The tracking algorithm used for video object extraction is based upon a new eigen-space update scheme. We propose a scheme for organisation of video objects in an appearance based hierarchy, constructed using a new SVD based eigen-space merging algorithm. The hierarchy enables approximate query resolution. Experiments performed on a large number of video sequences have yielded promising results.
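The abstract's SVD-based eigen-space merging can be sketched on toy data (our simplified rendition, not the authors' exact algorithm): each appearance model keeps a truncated SVD basis, and two models are merged by stacking their singular-value-weighted bases and re-running the SVD, without revisiting the raw frames. With no truncation the merge is exact, as the check at the end shows.

```python
import numpy as np

rng = np.random.default_rng(0)
X1 = rng.normal(size=(10, 6))   # observations of appearance cluster 1 (rows)
X2 = rng.normal(size=(10, 6))   # observations of appearance cluster 2

def eigen_model(X, k):
    """Truncated SVD model of a data block: top-k singular values/vectors."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return s[:k], Vt[:k]

k = 6                            # full rank here; smaller k = lossy merge
s1, V1 = eigen_model(X1, k)
s2, V2 = eigen_model(X2, k)

# Merge: SVD of the stacked, singular-value-weighted bases. The Gram matrix
# of the stack equals X1.T @ X1 + X2.T @ X2, so the merged spectrum matches
# the one computed from all raw data.
stacked = np.vstack([s1[:, None] * V1, s2[:, None] * V2])
s_merged, V_merged = eigen_model(stacked, k)

s_full, V_full = eigen_model(np.vstack([X1, X2]), k)
```

With k below the rank, the same recipe gives the approximate, memory-bounded merge a hierarchy of video objects needs.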
Object detection in images: run-time complexity and parameter selection of support vector machines
N. Ancona, G. Cicirelli, E. Stella, A. Distante
Pub Date: 2002-12-10 | DOI: 10.1109/ICPR.2002.1048330
We address two aspects of applying support vector machines (SVMs) to classification in real application domains, such as the detection of objects in images. The first concerns reducing the run-time complexity of a reference classifier without increasing its generalization error. We show that test-phase complexity can be reduced by training SVM classifiers on a new set of features obtained using principal component analysis (PCA). Moreover, because only a small number of features is involved, we explicitly map the new input space into the feature space induced by the adopted kernel function. Since the classifier is simply a hyperplane in the feature space, classifying a new pattern then involves only a dot product between the normal to the hyperplane and the pattern. The second aspect concerns parameter selection. In particular, we show that receiver operating characteristic curves, measured on a suitable validation set, are effective for selecting, among the classifiers the machine implements, the one with performance closest to the reference classifier. We address both issues in the context of detecting goals during a football match.
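The explicit-map trick can be shown in a few lines (a toy with two input features, not the paper's goal-detection system): the degree-2 polynomial kernel K(x, y) = (x·y + 1)² equals a dot product of explicit feature vectors φ(x)·φ(y), so the kernel expansion over support vectors collapses into one hyperplane normal w, and test-time cost no longer depends on the number of support vectors.

```python
import math

def phi(x):
    """Explicit feature map for the kernel (x.y + 1)^2 with 2 inputs."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return [1.0, r2 * x1, r2 * x2, x1 * x1, r2 * x1 * x2, x2 * x2]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x, y = [0.5, -1.0], [2.0, 0.25]
kernel_val = (dot(x, y) + 1.0) ** 2
mapped_val = dot(phi(x), phi(y))          # identical by construction

# Hypothetical trained expansion: (support vector, alpha_i * y_i) pairs.
svs = [([0.0, 1.0], 1.0), ([1.0, 0.0], -1.0)]

# Classic evaluation: one kernel call per support vector.
decision_kernel = sum(c * (dot(sv, x) + 1.0) ** 2 for sv, c in svs)

# Folded evaluation: precompute w once, then a single dot product per pattern.
w = [sum(c * phi(sv)[j] for sv, c in svs) for j in range(6)]
decision_linear = dot(w, phi(x))
```

PCA is what makes this practical: the explicit map's dimension grows quickly with the number of input features, so it only pays off after the feature count has been reduced.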
Probabilistic tracking with optimal scale and orientation selection
Hwann-Tzong Chen, Tyng-Luh Liu
Pub Date: 2002-12-10 | DOI: 10.1109/ICPR.2002.1048390
We describe a probabilistic framework based on a trust-region method for tracking rigid or non-rigid objects with automatic selection of the optimal scale and orientation. The approach uses a flexible probability model to represent an object by its salient features, such as color or intensity gradient. Depending on the weighting scheme, features contribute to the distribution differently according to their positions. We adopt a bivariate normal as the weighting function, so that only features within the induced covariance ellipse are considered; characterizing an object by a covariance ellipse also makes it easier to define its orientation and scale. To perform tracking, a trust-region scheme is carried out on each image frame to detect a distribution similar to the target's, accounting for translation, scale, and orientation simultaneously. Unlike other work, the optimization is executed over a continuous space. Consequently, our method is more robust and accurate, as demonstrated in the experimental results.
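The weighting scheme can be sketched as follows (mean, covariance, and the unit-ellipse cutoff are our illustrative choices): pixel features receive bivariate-normal weights, and only pixels inside the covariance ellipse (squared Mahalanobis distance at most 1) contribute, which is also what ties scale and orientation to the ellipse.

```python
import math

def mahalanobis2(p, mean, cov):
    """Squared Mahalanobis distance under a 2x2 covariance matrix."""
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det  # 2x2 inverse
    return dx * (ia * dx + ib * dy) + dy * (ic * dx + id_ * dy)

def weight(p, mean, cov):
    """Normal weight inside the covariance ellipse, zero outside."""
    m2 = mahalanobis2(p, mean, cov)
    return math.exp(-0.5 * m2) if m2 <= 1.0 else 0.0

mean = (0.0, 0.0)
cov = ((4.0, 0.0), (0.0, 1.0))            # wide, axis-aligned target
w_center = weight((0.0, 0.0), mean, cov)  # maximal at the mode
w_edge = weight((2.0, 0.0), mean, cov)    # on the ellipse boundary
w_out = weight((0.0, 2.0), mean, cov)     # outside: excluded entirely
```

In the paper's framework the trust-region step then adjusts the mean (translation) and covariance (scale and orientation) continuously to match the target distribution.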
Pattern classification using support vector machine ensemble
Hyun-Chul Kim, Shaoning Pang, Hong-Mo Je, Daijin Kim, S. Bang
Pub Date: 2002-12-10 | DOI: 10.1109/ICPR.2002.1048262
While the support vector machine (SVM) can provide good generalization performance, its classification results in practice often fall short of the theoretically expected level, because practical implementations rely on approximate algorithms to cope with the high time and space complexity. To improve the limited classification performance of such real SVMs, we propose an SVM ensemble built with bagging (bootstrap aggregating) or boosting. In bagging, each individual SVM is trained independently on training samples chosen randomly via a bootstrap technique. In boosting, each individual SVM is trained on samples drawn from a probability distribution that is updated in proportion to each sample's error. In both cases, the trained individual SVMs are aggregated into a collective decision in several ways, such as majority voting, weighting based on least squares estimation, and a double-layer hierarchical combination. Simulation results for handwritten digit recognition and fraud detection show that the proposed SVM ensemble with bagging or boosting greatly outperforms a single SVM in classification accuracy.
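The bagging-plus-majority-vote recipe can be sketched in a self-contained way (our toy: a 1-D threshold learner stands in for the SVM base classifier, and the data are synthetic):

```python
import random

def fit_stump(data):
    """Threshold at the midpoint between class means (SVM stand-in)."""
    pos = [x for x, y in data if y == 1]
    neg = [x for x, y in data if y == -1]
    t = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0
    return lambda x: 1 if x >= t else -1

def bagged_ensemble(data, n_learners, rng):
    """Bootstrap-resample, fit one learner per replicate, majority-vote."""
    learners = []
    for _ in range(n_learners):
        boot = [rng.choice(data) for _ in data]  # sample with replacement
        if len({y for _, y in boot}) < 2:        # degenerate resample
            boot = data
        learners.append(fit_stump(boot))
    return lambda x: 1 if sum(f(x) for f in learners) >= 0 else -1

rng = random.Random(7)
train = ([(x / 10.0, -1) for x in range(10)]
         + [(x / 10.0, 1) for x in range(10, 20)])
vote = bagged_ensemble(train, n_learners=11, rng=rng)
```

The variance-reduction argument is the same with SVM base learners; the boosting variant differs only in drawing each replicate from the error-weighted sample distribution instead of uniformly.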
Rapid generation of event-based indexes for personalized video digests
M. Teraguchi, Ken Masumitsu, T. Echigo, Shun-ichi Sekiguchi, M. Etoh
Pub Date: 2002-12-10 | DOI: 10.1109/ICPR.2002.1048483
This paper presents a novel video indexing method for providing timely, personalized video delivery services. Several previous studies have dealt with automatic content-based indexing, but such systems require considerable time for manually correcting the indexes after the preliminary automatic processing, making the indexes difficult to use in practical services. In our method, reliable content-based indexes are generated by manually flagging predefined, easily recognizable events while watching the video (without rewinding). The generated indexes use temporal functions of significance based on parameters predefined for each kind of flagged event. We thereby reduce the time required for flagging and achieve rapid semi-manual video indexing. Finally, we confirm the effectiveness of the indexes, transformed into a subset of MPEG-7 descriptions, in a practical service that automatically constructs personalized video digests.
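A toy rendering of the temporal significance idea (the event kinds, peaks, and widths below are entirely our invention): each flagged event contributes a predefined triangular significance function around its time stamp, the curves are summed to score every second, and a digest keeps the top-scoring seconds.

```python
EVENT_PARAMS = {"goal": (1.0, 8), "foul": (0.4, 4)}  # (peak, half-width s)

def significance(t, events):
    """Total significance of time t given flagged (kind, time) events."""
    total = 0.0
    for kind, t0 in events:
        peak, half = EVENT_PARAMS[kind]
        total += max(0.0, peak * (1 - abs(t - t0) / half))
    return total

events = [("goal", 30), ("foul", 33)]                # manual flags
scores = {t: significance(t, events) for t in range(25, 41)}
digest = sorted(scores, key=scores.get, reverse=True)[:5]
```

Because the operator only supplies (kind, time) pairs, the curve shapes do all the remaining work, which is what makes the flagging fast.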
Use of characteristic views in image classification
C. Hung, Shisong Yang, C. Laymon
Pub Date: 2002-12-10 | DOI: 10.1109/ICPR.2002.1048462
This paper addresses the problem of image texture classification. We present a novel texture feature called the "characteristic view", which is extracted directly from a sample sub-image corresponding to each texture class. The K-views template method is proposed to classify texture pixels based on these features. The characteristic-view concept rests on the assumption that in an image of a natural scene, a specific texture class will frequently reveal repetitions of certain classes of features; different "views" of these features can be obtained at different spatial locations. Experimental results show the effectiveness of the proposed approach compared with other methods.
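A tiny 1-D rendition of the K-views idea (our sketch, not the paper's implementation): each texture class is represented by the set of small windows ("views") sampled from its training patch, and a pixel is labeled by the class owning the view closest to the window centered on it.

```python
def views(signal, width):
    """All sliding windows ("views") of a training sample."""
    return [tuple(signal[i:i + width])
            for i in range(len(signal) - width + 1)]

def sqdist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

CLASS_VIEWS = {
    "stripes": views([0, 1, 0, 1, 0, 1, 0, 1], 3),  # alternating texture
    "flat":    views([5, 5, 5, 5, 5, 5, 5, 5], 3),  # constant texture
}

def classify(window):
    """Label by the class with the nearest characteristic view."""
    return min(CLASS_VIEWS,
               key=lambda c: min(sqdist(v, window) for v in CLASS_VIEWS[c]))

label_a = classify((1, 0, 1))   # matches a stripes view exactly
label_b = classify((5, 5, 4))   # closest to the flat views
```

The repetition assumption is what lets a small set of views per class stand in for the full texture.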
Image segmentation using gradient vector diffusion and region merging
Zeyun Yu, C. Bajaj
Pub Date: 2002-12-10 | DOI: 10.1109/ICPR.2002.1048460
The active contour (or snake) model is recognized as an efficient tool for 2D/3D image segmentation. However, traditional snake models are limited in several respects. This paper describes a set of diffusion equations applied to image gradient vectors, yielding a vector field over the image domain. The resulting vector field provides the snake model with an external force as well as an automatic way to generate the initial contours. Finally, a region merging technique is employed to further improve the segmentation results.
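An illustrative 1-D analogue of gradient vector diffusion (not the authors' exact PDE): repeatedly applying a discrete diffusion step to the gradient field spreads the edge force from a step edge into the flat regions, so a snake feels an external force even far from the contour.

```python
def diffuse(field, steps, mu=0.25):
    """Explicit diffusion: f <- f + mu * discrete Laplacian (mu <= 0.5)."""
    f = list(field)
    n = len(f)
    for _ in range(steps):
        f = [f[i] + mu * (f[max(i - 1, 0)] - 2 * f[i]
                          + f[min(i + 1, n - 1)])
             for i in range(n)]
    return f

# Gradient of a step edge: nonzero only at the edge location.
grad = [0.0] * 10
grad[5] = 1.0
smooth = diffuse(grad, steps=25)

force_far = smooth[1]   # a pixel 4 steps away now feels a nonzero force
```

In 2-D this is done per vector component, and the zero-crossings and sinks of the diffused field are what supply the automatic initial contours.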
Textual description of human activities by tracking head and hand motions
A. Kojima, Takeshi Tamura, K. Fukunaga
Pub Date: 2002-12-10 | DOI: 10.1109/ICPR.2002.1048491
We propose a method for describing human activities from video images by tracking human skin regions, namely the face and hands. To detect skin regions robustly, three kinds of probabilistic information are extracted and integrated using Dempster-Shafer theory. The main difficulty in transforming video images into textual descriptions is bridging the semantic gap between them. By associating visual features of head and hand motion with natural language concepts, appropriate syntactic components such as verbs and objects are determined and translated into natural language.
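The evidence integration step rests on Dempster's rule of combination, sketched here on a two-hypothesis frame (the cue names and mass values are generic illustrations, not the paper's): each source assigns belief mass to "skin", "nonskin", and the ignorance set "either", and combination renormalizes by the conflict between sources.

```python
def combine(m1, m2):
    """Dempster's rule over the frame {skin, nonskin}."""
    sets = {"skin": {"skin"}, "nonskin": {"nonskin"},
            "either": {"skin", "nonskin"}}
    out = {k: 0.0 for k in sets}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = sets[a] & sets[b]
            if not inter:                       # contradictory evidence
                conflict += wa * wb
            elif inter == sets["either"]:       # both remain ignorant
                out["either"] += wa * wb
            else:                               # intersection is a singleton
                out[next(iter(inter))] += wa * wb
    return {k: v / (1.0 - conflict) for k, v in out.items()}

color_cue = {"skin": 0.6, "nonskin": 0.1, "either": 0.3}
motion_cue = {"skin": 0.5, "nonskin": 0.2, "either": 0.3}
fused = combine(color_cue, motion_cue)   # agreeing cues reinforce "skin"
```

Two weakly confident but agreeing cues yield a fused skin mass higher than either alone, which is the robustness property the detector exploits.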