An approach to learning view-invariant object representations was explored, based on learning legal or naturalistic view transformations over time from the statistical properties of natural movies. A simple cell layer responded to localized oriented image structure, and a complex cell layer learned to respond to those subsets of simple cells with the strongest tendencies to trade off activity with one another in response to movement of the visual stimulus. Tradeoffs between simple cells were strongest for same-orientation translation and fell off rapidly with changes in orientation. The local complex cell responses thus became insensitive to typical object motion, as evidenced by a broadening of response to stimulus phase, while remaining sensitive to local object form. The model makes predictions about synaptic learning rules in complex cells and about mechanisms of successive view invariance in the primate ventral stream.
{"title":"Toward view-invariant representations of object structure learned using object constancy cues in natural movies","authors":"J. Colombe","doi":"10.1109/AIPR.2004.47","DOIUrl":"https://doi.org/10.1109/AIPR.2004.47","url":null,"abstract":"An approach to learning view-invariant object representations was explored based on the learning of legal or naturalistic view transformations in time, learned from the statistical properties of natural movies. A simple cell layer responded to localized oriented image structure, and a complex cell layer learned to respond to those subsets of simple cells with the strongest tendencies to trade off activity with each other in response to movement of the visual stimulus. Tradeoffs between simple cells were strongest in response to same-orientation translation, and fell off rapidly with changes in orientation. The local complex cell responses thus became insensitive to typical object motion, evidenced by broadening of response to stimulus phase, while remaining sensitive to local object form. The model makes predictions about synaptic learning rules in complex cells, and mechanisms of successive view-invariance in the primate ventral stream.","PeriodicalId":120814,"journal":{"name":"33rd Applied Imagery Pattern Recognition Workshop (AIPR'04)","volume":"111 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116650632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, an efficient hardware design for a nonlinear color image enhancement technique is presented. The technique works very effectively for images captured in extremely dark environments as well as under non-uniform lighting, where 'bright' regions are kept unaffected while 'dark' objects in 'bright' backgrounds are enhanced. For efficient implementation of the nonlinear technique on a targeted FPGA board, estimation techniques for the logarithm and inverse logarithm are introduced. The estimation method reduces computation time and FPGA resource usage significantly compared to conventional implementations of computationally intensive operations such as the logarithm. The enhancement technique is further analyzed and rearranged into hardware algorithmic steps to better suit a high-performance implementation. A number of parallel functional modules are designed to operate simultaneously, optimally exploiting the operation-level parallelism available in the technique. Sequential operations are partitioned into well-balanced workload stages of a pipelined system, based on the inter-data dependency of the algorithmic steps, to better utilize FPGA resources such as on-chip RAM and logic blocks. The system targets high-performance real-time color image enhancement at a minimum of 25 frames per second.
{"title":"A nonlinear technique for enhancement of color images: an architectural perspective for real-time applications","authors":"H. T. Ngo, Li Tao, V. Asari","doi":"10.1109/AIPR.2004.6","DOIUrl":"https://doi.org/10.1109/AIPR.2004.6","url":null,"abstract":"In this paper, an efficient hardware design for a nonlinear technique for enhancement of color images is presented. The enhancement technique works very effectively for images captured under extremely dark environment as well as non-uniform lighting environment where 'bright\" regions are kept unaffected and 'dark' objects in 'bright' background. For efficient implementation of the nonlinear technique on a targeted FPGA board, estimation techniques for logarithm and inverse logarithm are introduced. The estimation method helps to reduce the computational time and FPGA resources significantly compared to conventional implementations of computational intensive operations such as logarithm. The enhancement technique is further analyzed and rearranged into hardware algorithmic steps to better suit the high performance implementation. A number of parallel functional modules are designed to operate simultaneously to optimally utilize the operation-level parallelism available in the technique. Sequential operations are partitioned into well-balance workload stages of a pipelined system based on the inter-data-dependency of the algorithmic steps to better utilize the resources in a FPGA such as on-chip RAM and logic-blocks. The image enhancement system is designed to target the high-performance for real time color image enhancement with minimum 25 frames per second.","PeriodicalId":120814,"journal":{"name":"33rd Applied Imagery Pattern Recognition Workshop (AIPR'04)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125364104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we present a study of the effect of using a fuzzy find matching technique with the objective of increasing the precision of textual information retrieval for image analysis while allowing for mismatches. This technique is very helpful for searching in areas where misspellings are likely, for example when retrieving text information from an image. The technique we propose can be used as an embedded component or a post-processing tool for image analysis, resulting in faster retrieval of corresponding information from images for given keywords. When there are no exact hits, our technique suggests a few close matches that can then be searched as keywords, yielding a closer match with improved search speed.
{"title":"A fuzzy find matching tool for image text analysis","authors":"S. Berkovich, M. Inayatullah","doi":"10.1109/AIPR.2004.2","DOIUrl":"https://doi.org/10.1109/AIPR.2004.2","url":null,"abstract":"In this paper, we present a study based on the effect of using a fuzzy find matching technique with the objective of increasing the precision of textual information retrieval for image analysis while allowing for mismatches. This technique is very helpful for searching those areas of interest where chances of misspelling are more likely, for example retrieving text information from an image. The technique we propose can be used as an embedded component or a post-processing tool for image analysis resulting in a faster retrieval of corresponding information from images with given keywords. In the case of no hits, our technique outcome would suggest few close matches that can be further searched in keywords to get a closer match with improved speed of searching.","PeriodicalId":120814,"journal":{"name":"33rd Applied Imagery Pattern Recognition Workshop (AIPR'04)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121912492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improved target detection, reduced false alarm rates, and enhanced timeliness are critical to meeting the requirements of current and future military missions. We present a new approach to target detection, based on a suite of image processing and exploitation tools developed under the intelligent searching of images and signals (ISIS) program at Los Alamos National Laboratory. Performance assessment of these algorithms relies on a new metric for scoring target detection that is relevant to the analyst's needs. An object-based loss function is defined by the degree to which the automated processing focuses the analyst's attention on the true targets and avoids false positives. For target detection techniques that produce a pixel-by-pixel classification (and thereby produce not just an identification of the target, but a segmentation as well), standard scoring rules are not appropriate because they unduly penalize partial detections. From a practical standpoint, it is not necessary to identify every single pixel that is on the target; all that is required is that the processing draw the analyst's attention to the target. By incorporating this scoring metric directly into the target detection algorithm, improved performance in this more practical context can be obtained.
{"title":"Approach to target detection based on relevant metric for scoring performance","authors":"J. Theiler, N. Harvey, N. David, J. Irvine","doi":"10.1109/AIPR.2004.14","DOIUrl":"https://doi.org/10.1109/AIPR.2004.14","url":null,"abstract":"Improved target detection, reduced false alarm rates, and enhanced timeliness are critical to meeting the requirements of current and future military missions. We present a new approach to target detection, based on a suite of image processing and exploitation tools developed under the intelligent searching of images and signals (ISIS) program at Los Alamos National Laboratory. Performance assessment of these algorithms relies on a new metric for scoring target detection that is relevant to the analyst's needs. An object-based loss function is defined by the degree to which the automated processing focuses the analyst's attention on the true targets and avoids false positives. For target detection techniques that produce a pixel-by-pixel classification (and thereby produce not just an identification of the target, but a segmentation as well), standard scoring rules are not appropriate because they unduly penalize partial detections. From a practical standpoint, it is not necessary to identify every single pixel that is on the target; all that is required is that the processing draw the analyst's attention to the target. By employing this scoring metric directly into the target detection algorithm, improved performance in this more practical context can be obtained.","PeriodicalId":120814,"journal":{"name":"33rd Applied Imagery Pattern Recognition Workshop (AIPR'04)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132819389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Next-generation reconnaissance systems (NGRS) offer dynamic tasking of a menu of sensor modalities such as video, multi/hyper-spectral, and polarization data. A key issue is how best to exploit these modes in time-critical scenarios such as target tracking and event detection. It is essential to represent diverse sensor content in a unified measurement space so that each modality's contribution to the exploitation task can be evaluated. In this paper, mutual information is used to represent the content of individual sensor channels. A series of experiments on video tracking have been carried out to demonstrate the effectiveness of mutual information as a fusion framework. These experiments quantify the relative information content of intensity, color, and polarization image channels.
{"title":"Fusion of intensity, texture, and color in video tracking based on mutual information","authors":"J. Mundy, Chung-Fu Chang","doi":"10.1109/AIPR.2004.26","DOIUrl":"https://doi.org/10.1109/AIPR.2004.26","url":null,"abstract":"Next-generation reconnaissance systems (NGRS) offer dynamic tasking of a menu of sensor modalities such as video, multi/hyper-spectral and polarization data. A key issue is how best to exploit these modes in time critical scenarios such as target tracking and event detection. It is essential to be able to represent diverse sensor content in a unified measurement space so that the contribution of each modality can be evaluated in terms of its contribution to the exploitation task. In this paper, mutual information is used to represent the content of individual sensor channels. A series of experiments on video tracking have been carried out to demonstrate the effectiveness of mutual information as a fusion framework. These experiments quantify the relative information content of intensity, color, and polarization image channels.","PeriodicalId":120814,"journal":{"name":"33rd Applied Imagery Pattern Recognition Workshop (AIPR'04)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133266178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The most common metric to assess a classifier's performance is the classification error rate, or the probability of misclassification (PMC). Receiver operating characteristic (ROC) analysis is a more general way to measure performance. Metrics that summarize the ROC curve include the two normal-deviate-axes parameters, a and b, and the area under the curve (AUC). The parameters a and b represent the intercept and slope, respectively, of the ROC curve when plotted on normal-deviate axes. The AUC represents the average of the classifier's true-positive fraction (TPF) over the false-positive fraction (FPF) as the decision threshold varies. In the present work, we used Monte Carlo simulations to compare different bootstrap-based estimators, e.g., the leave-one-out, .632, and .632+ bootstraps, for estimating the AUC. The results show comparable performance of the different estimators in terms of RMS error, while the .632+ is the least biased.
{"title":"Comparison of non-parametric methods for assessing classifier performance in terms of ROC parameters","authors":"W. Yousef, R. F. Wagner, M. Loew","doi":"10.1109/AIPR.2004.18","DOIUrl":"https://doi.org/10.1109/AIPR.2004.18","url":null,"abstract":"The most common metric to assess a classifier's performance is the classification error rate, or the probability of misclassification (PMC). Receiver operating characteristic (ROC) analysis is a more general way to measure the performance. Some metrics that summarize the ROC curve are the two normal-deviate-axes parameters, i.e., a and b, and the area under the curve (AUC). The parameters \"a\" and \"b\" represent the intercept and slope, respectively, for the ROC curve if plotted on normal-deviate-axes scale. AUC represents the average of the classifier TPF over FPF resulting from considering different threshold values. In the present work, we used Monte-Carlo simulations to compare different bootstrap-based estimators, e.g., leave-one-out, .632, and .632+ bootstraps, to estimate the AUC. The results show the comparable performance of the different estimators in terms of RMS, while the .632+ is the least biased.","PeriodicalId":120814,"journal":{"name":"33rd Applied Imagery Pattern Recognition Workshop (AIPR'04)","volume":"449 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116189182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We describe an improvement on the mountain method (MM) of clustering originally proposed by Yager and Filev. The new technique employs a data-driven, hierarchical partitioning of the data set to be clustered, using a "p-tree" algorithm. The centroids of data subsets in the terminal nodes of the p-tree are the set of candidate cluster centers to which the iterative candidate cluster center selection process of MM is applied. As the data dimension and/or the number of uniform grid lines used in the original MM increases, our approach requires exponentially fewer cluster centers to be evaluated by the MM selection algorithm. Sample data sets illustrate the performance of this new technique.
{"title":"Mountain clustering on nonuniform grids","authors":"J. T. Rickard, R. Yager, W. Miller","doi":"10.1109/AIPR.2004.31","DOIUrl":"https://doi.org/10.1109/AIPR.2004.31","url":null,"abstract":"We describe an improvement on the mountain method (MM) of clustering originally proposed by Yager and Filev. The new technique employs a data-driven, hierarchical partitioning of the data set to be clustered, using a \"p-tree\" algorithm. The centroids of data subsets in the terminal nodes of the p-tree are the set of candidate cluster centers to which the iterative candidate cluster center selection process of MM is applied. As the data dimension and/or the number of uniform grid lines used in the original MM increases, our approach requires exponentially fewer cluster centers to be evaluated by the MM selection algorithm. Sample data sets illustrate the performance of this new technique.","PeriodicalId":120814,"journal":{"name":"33rd Applied Imagery Pattern Recognition Workshop (AIPR'04)","volume":"176 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129885192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Here a method is presented that recognizes a face by capturing the unique illumination spot created at the centre of the eye, i.e., on the cornea. This illumination spot is present whenever the subject is in light or in front of a light source. The algorithm works by running an edge detector on a sample face. The edge detector produces a dark spot in place of the illumination spot and a semicircular arc at the boundary of the cornea. This arc and the dark spot act as a unique template for locating the eye in the image, after which we extract other features, i.e., the eyebrows, lip line, and chin line. We then calculate the vertical distance of each feature from every other feature and use these distance parameters to recognize faces.
{"title":"Face recognition by capturing eye illumination spot","authors":"N. Singh","doi":"10.1109/AIPR.2004.24","DOIUrl":"https://doi.org/10.1109/AIPR.2004.24","url":null,"abstract":"Here a method is presented which recognizes a face by capturing a unique illumination spot created in the centre of the eye i.e. the cornea. This illumination spot is always there whenever we are in light or in front of a light source. This algorithm works by running an edge detector on a sample face. The edge detector produces a dark spot in place of the illumination spot and a semicircular arc in place of the beginning of the cornea. This arc and the dark spot act as a unique template for capturing eye in the image and following this we extract other features i.e. eyebrows, lip line and chin line. Following this we calculate the vertical distance of each feature from every other feature and use these distance parameters to recognize faces.","PeriodicalId":120814,"journal":{"name":"33rd Applied Imagery Pattern Recognition Workshop (AIPR'04)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129064911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, recognition of ancient middle Persian documents is studied. Our attention is focused on feature extraction and classification. A set of invariant moments is selected as the features, and the minimum mean distance classifier (in three versions, called MMD1, MMD2, and MMD3), KNN, and Parzen classifiers are used. Preprocessing is also considered, allowing the effects of undersampling (resolution pyramids), smoothing, and thinning to be investigated. The algorithm has been tested not only on the original and smoothed images but also on skeletonized and undersampled versions of the text under test. The results show an acceptable recognition rate with the selected features and the proposed processing for middle age Persian. The best classification rates achieved are 95% and 90.5% for smoothed and original character images, respectively. It is interesting to note that the KNN and MMD2 classifiers yielded the better recognition rates.
{"title":"Recognition of middle age Persian characters using a set of invariant moments","authors":"S. Alirezaee, H. Aghaeinia, M. Ahmadi, K. Faez","doi":"10.1109/AIPR.2004.39","DOIUrl":"https://doi.org/10.1109/AIPR.2004.39","url":null,"abstract":"In this paper, recognition of ancient middle Persian documents is studied. Our major attention has been focused on feature extraction and classification. A set of invariant moments has been selected as the features and the minimum mean distance (three versions of which that is called MMD1, MMD2, MMD3), KNN and Parzen as the classifier. Preprocessing is also considered in this paper which allows, the effects of under sampling (resolution pyramids), smoothing, and thinning be investigated. The algorithm has been tested not only on the original and smoothed images but also on the skeletonized and under sampled version of the text under test. The results show an acceptable recognition rate with the selected features with the proposed processing for the middle age Persian. The best-achieved classification rates are 95% and 90.5% for smoothed and original character images respectively. It was interesting to note that KNN and MMD2 classifiers yielded better recognition rate.","PeriodicalId":120814,"journal":{"name":"33rd Applied Imagery Pattern Recognition Workshop (AIPR'04)","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124605979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tablet PC-based engineering software can be used as an effective teaching tool for core engineering courses such as electronics, signals and systems, and digital systems. Wireless connections between students' tablet PCs and the professor's can substantially improve student involvement during the course. Circuit drawing is an important task, especially in undergraduate courses such as electronics and digital systems. Most existing software tools for circuit drawing use a toolbox in which symbols for all circuit components are prepared and ready to be picked up; a user has to go through several layers of menus each time he or she wants to use a circuit symbol. To improve human-computer interaction, we have developed an online recognition system on a tablet PC, written in C#, for handwritten circuits and their components. The system can recognize and redraw many circuits and components, such as resistors, capacitors, grounds, and various voltage power supplies, drawn with a stylus pen on a tablet PC. We present details of our approach and preliminary results from an experimental system.
{"title":"Online handwritten circuit recognition on a tablet PC","authors":"O. Ejofodomi, Shani Ross, A. Jendoubi, M. Chouikha, J. Zeng","doi":"10.1109/AIPR.2004.35","DOIUrl":"https://doi.org/10.1109/AIPR.2004.35","url":null,"abstract":"Tablet PC-based engineering software can be used as an effective teaching tool for core engineering courses such as electronics, signal and systems, and digital systems. Wireless connection between tablet PCs of students and the teaching professor will substantially improve students' involvement during the course. Circuit drawing is an important task especially in undergraduate courses such as electronics and digital systems. Most existing software tools for circuit drawing use a toolbox where symbols for all circuit components are prepared and ready for pick up. A user has to go through a number of layers of menus each time he/she wants to use a circuit symbol. To improve human computer interaction, we have developed an online recognition system on a tablet PC using C# for the handwritten circuit and its components. The system can recognize and redraw many circuits and their components such as resistors, capacitors, ground and various voltage power supplies, which are drawn with a stylus pen on a tablet PC. We present details of our approach and preliminary results of an experimental system.","PeriodicalId":120814,"journal":{"name":"33rd Applied Imagery Pattern Recognition Workshop (AIPR'04)","volume":"60 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127384532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}