On detecting abnormalities in digital mammography
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759684
W. Yousef, Waleed Mustafa, Ali A. Ali, Naglaa A. Abdelrazek, Ahmed M. Farrag
Breast cancer is the most common cancer in many countries all over the world. Early detection of cancer, in either diagnosis or screening programs, decreases mortality rates. Computer Aided Detection (CAD) is software that aids radiologists in detecting abnormalities in medical images. In this article we present our approach to detecting abnormalities in mammograms using digital mammography. Each mammogram in our dataset is manually processed by a radiologist, using software specially developed for that purpose, to mark and label the different types of abnormalities. Once marked, all subsequent processing is carried out by computer algorithms. The majority of existing detection techniques rely on image processing (IP) to extract Regions of Interest (ROIs) and then extract features from those ROIs as input to a statistical learning machine (classifier). In this approach, detection is essentially done in the IP phase, while the ultimate role of the classifier is to reduce the number of False Positives (FPs) produced by the IP phase. In contrast, in the pixel-based approach, processing algorithms and classifiers work directly at the pixel level. We demonstrate the performance of several methods belonging to this approach and suggest an assessment metric in terms of the Mann-Whitney statistic.
{"title":"On detecting abnormalities in digital mammography","authors":"W. Yousef, Waleed Mustafa, Ali A. Ali, Naglaa A. Abdelrazek, Ahmed M. Farrag","doi":"10.1109/AIPR.2010.5759684","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759684","url":null,"abstract":"Breast cancer is the most common cancer in many countries all over the world. Early detection of cancer, in either diagnosis or screening programs, decreases the mortality rates. Computer Aided Detection (CAD) is software that aids radiologists in detecting abnormalities in medical images. In this article we present our approach in detecting abnormalities in mammograms using digital mammography. Each mammogram in our dataset is manually processed — using software specially developed for that purpose — by a radiologist to mark and label different types of abnormalities. Once marked, processing henceforth is applied using computer algorithms. The majority of existing detection techniques relies on image processing (IP) to extract Regions of Interests (ROI) then extract features from those ROIs to be the input of a statistical learning machine (classifier). Detection, in this approach, is basically done at the IP phase; while the ultimate role of classifiers is to reduce the number of False Positives (FP) detected in the IP phase. In contrast, processing algorithms and classifiers, in pixel-based approach, work directly at the pixel level. We demonstrate the performance of some methods belonging to this approach and suggest an assessment metric in terms of the Mann Whitney statistic.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121511796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Laplacian based image fusion
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759697
J. Scott, M. Pusateri
A fundamental goal in multispectral image fusion is to combine relevant information from multiple spectral ranges while displaying a constant amount of data as a single channel. Because we expect synergy between the views afforded by different parts of the spectrum, producing output imagery with more information than any of the individual inputs sounds simple. While fusion algorithms achieve synergy under specific scenarios, they often produce imagery with less information than any single band. Losses can arise from any number of problems, including poor imagery in one band degrading the fusion result, loss of detail from intrinsic smoothing, artifacts or discontinuities from discrete mixing, and distracting colors from unnatural color mapping. We have been developing and testing fusion algorithms with the goal of achieving synergy under a wider range of scenarios. Such techniques have been very successful in image blending, mosaicking, and image compositing for visible-band imagery. The algorithm presented in this paper is based on direct pixel-wise fusion that merges the directional discrete Laplacian content of the individual imagery bands rather than their intensities. The Laplacian captures the local differences within the four-connected neighborhood. The Laplacians of the input images are then mixed based on the premise that image edges carry the most pertinent information from each input image. The mixed result is reformed into an image by solving the two-dimensional Poisson equation. The preliminary results are promising and consistent. When fusing multiple continuous visible channels, the resulting image is similar to grayscale imaging over all of the visible channels. When fusing discontinuous and/or non-visible channels, the resulting image is subtly mixed and intuitive to understand.
{"title":"Laplacian based image fusion","authors":"J. Scott, M. Pusateri","doi":"10.1109/AIPR.2010.5759697","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759697","url":null,"abstract":"A fundamental goal in multispectral image fusion is to combine relevant information from multiple spectral ranges while displaying a constant amount of data as a single channel. Because we expect synergy between the views afforded by different parts of the spectrum, producing output imagery with increased information beyond any of the individual imagery sounds simple. While fusion algorithms achieve synergy under specific scenarios, it is often the case that they produce imagery with less information than any single band of imagery. Losses can arise from any number of problems including poor imagery in one band degrading the fusion result, loss of details from intrinsic smoothing, artifacts or discontinuities from discrete mixing, and distracting colors from unnatural color mapping. We have been developing and testing fusion algorithms with the goal of achieving synergy under a wider range of scenarios. This technique has been very successful in the worlds of image blending, mosaics, and image compositing for visible band imagery. The algorithm presented in this paper is based on direct pixel-wise fusion that merges the directional discrete laplacian content of individual imagery bands rather than the intensities directly. The laplacian captures the local difference in the four-connected neighborhood. The laplacian of each image is then mixed based on the premise that image edges contain the most pertinent information from each input image. This information is then reformed into an image by solving the two-dimensional Poisson equation. The preliminary results are promising and consistent. When fusing multiple continuous visible channels, the resulting image is similar to grayscale imaging over all of the visible channels. When fusing discontinuous and/or non-visible channels, the resulting image is subtly mixed and intuitive to understand.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130218604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Modeling spatial dependencies in high-resolution overhead imagery
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759714
A. Cheriyadat, Ranga Raju Vatsavai, E. Bright
Human settlement regions with different physical and socio-economic attributes exhibit unique spatial characteristics that are often apparent in high-resolution overhead imagery. For example, the size, shape, and spatial arrangement of man-made structures are key attributes that vary with the socio-economic profile of the neighborhood. Successfully modeling these attributes is crucial for developing advanced image understanding systems that interpret complex aerial scenes. In this paper we present three different approaches to modeling the spatial context in overhead imagery. First, we show that the frequency domain of the image can be used to model the spatial context [1]. The shape of the spectral energy contours characterizes the scene context and can be exploited as a global feature. Secondly, we explore a discriminative framework based on Conditional Random Fields (CRFs) [2] to model the spatial context in overhead imagery. Features derived from the edge orientation distribution computed over a neighborhood, together with the associated class labels, are used as input to model the spatial context. Our third approach groups spatially connected pixels based on low-level edge primitives to form support regions [3]. Statistical parameters generated from the support-region feature distributions characterize different geospatial neighborhoods. We apply these approaches to high-resolution overhead imagery and show that they characterize its spatial context.
{"title":"Modeling spatial dependencies in high-resolution overhead imagery","authors":"A. Cheriyadat, Ranga Raju Vatsavai, E. Bright","doi":"10.1109/AIPR.2010.5759714","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759714","url":null,"abstract":"Human settlement regions with different physical and socio-economic attributes exhibit unique spatial characteristics that are often illustrated in high-resolution overhead imageries. For example-size, shape and spatial arrangements of man-made structures are key attributes that vary with respect to the socio-economic profile of the neighborhood. Successfully modeling these attributes is crucial in developing advanced image understanding systems for interpreting complex aerial scenes. In this paper we present three different approaches to model the spatial context in the overhead imagery. First, we show that the frequency domain of the image can be used to model the spatial context [1]. The shape of the spectral energy contours characterize the scene context and can be exploited as global features. Secondly, we explore a discriminative framework based on the Conditional Random Fields (CRF) [2] to model the spatial context in the overhead imagery. The features derived from the edge orientation distribution calculated for a neighborhood and the associated class labels are used as input features to model the spatial context. Our third approach is based on grouping spatially connected pixels based on the low-level edge primitives to form support-regions [3]. The statistical parameters generated from the support-region feature distributions characterize different geospatial neighborhoods. We apply our approaches on high-resolution overhead imageries. We show that proposed approaches characterize the spatial context in overhead imageries.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131793387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Speech Emotion Recognition using a backward context
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759701
Erhan Guven, P. Bock
The classification of emotions such as joy, anger, and anxiety from tonal variations in human speech is an important task for research and applications in human-computer interaction. In preceding work, it was demonstrated that locally extracted speech features match or surpass the performance of the global features adopted in current approaches. In this continuing research, a backward context, which can also be viewed as a feature vector memory, is shown to improve the prediction accuracy of the Speech Emotion Recognition engine. Preliminary results on a German emotional speech database show significant improvements over the results of the previous study.
{"title":"Speech Emotion Recognition using a backward context","authors":"Erhan Guven, P. Bock","doi":"10.1109/AIPR.2010.5759701","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759701","url":null,"abstract":"The classification of emotions, such as joy, anger, anxiety, etc. from tonal variations in human speech is an important task for research and applications in human computer interaction. In the preceding work, it has been demonstrated that the locally extracted features of speech match or surpass the performance of global features that has been adopted in current approaches. In this continuing research, a backward context, which also can be considered as a feature vector memory, is shown to improve the prediction accuracy of the Speech Emotion Recognition engine. Preliminary results on German emotional speech database illustrate significant improvements over results from the previous study.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123491680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Point cloud processing strategies for noise filtering, structural segmentation, and meshing of ground-based 3D Flash LIDAR images
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759695
D. Natale, Matthew S. Baran, R. Tutwiler
Well-performing flash LIDAR focal plane array devices are now commercially available. Such devices give us the ability to measure and record frame-registered 3D point cloud sequences at video frame rates. For many 3D computer vision applications, this allows the processes of structure from motion or multi-view stereo reconstruction to be circumvented, letting us construct simpler, more efficient, and more robust 3D computer vision systems. This is a particular advantage for ground-based vision tasks that require real-time or near real-time operation. The goal of this work is to introduce several important considerations for dealing with commercial 3D Flash LIDAR data and to describe useful strategies for noise filtering, structural segmentation, and meshing of ground-based data. With marginal refinement effort, the results of this work are directly applicable to many ground-based computer vision tasks.
{"title":"Point cloud processing strategies for noise filtering, structural segmentation, and meshing of ground-based 3D Flash LIDAR images","authors":"D. Natale, Matthew S. Baran, R. Tutwiler","doi":"10.1109/AIPR.2010.5759695","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759695","url":null,"abstract":"It is now the case that well-performing flash LIDAR focal plane array devices are commercially available. Such devices give us the ability to measure and record frame-registered 3D point cloud sequences at video frame rates. For many 3D computer vision applications this allows the processes of structure from motion or multi-view stereo reconstruction to be circumvented. This allows us to construct simpler, more efficient, and more robust 3D computer vision systems. This is a particular advantage for ground-based vision tasks which necessitate real-time or near real-time operation. The goal of this work is introduce several important considerations for dealing with commercial 3D Flash LIDAR data and to describe useful strategies for noise filtering, structural segmentation, and meshing of ground-based data. With marginal refinement efforts the results of this work are directly applicable to many ground-based computer vision tasks.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123715062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adaptive selection of visual and infra-red image fusion rules
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759689
Guang Yang, Yafeng Yin, H. Man
The fusion of images captured from Electro-Optical (EO) and Infra-Red (IR) cameras has been extensively studied for military applications in recent years. In this paper, we propose a novel wavelet-based framework for online fusion of EO and IR image sequences. The proposed framework provides multiple fusion rules as well as a novel edge-based evaluation method that selects the optimal fusion rule for different practical scenarios. In the fusion step, EO and IR images are decomposed into different levels by the 2D discrete wavelet transform. The wavelet coefficients at each level are combined by a set of fusion rules such as min-max selection, mean value, and weighted summation. The candidate fused images are obtained by the inverse wavelet transform of the combined coefficients. In the evaluation step, the Sobel operator is applied to both the fused images and the original images. The edge information retained in each fused image, relative to the originals, is calculated as the fusion quality assessment. Finally, the fused image with the highest assessment value is selected as the fusion result. In this way, the proposed method adaptively selects the best fusion rule for EO and IR images under different scenarios.
{"title":"Adaptive selection of visual and infra-red image fusion rules","authors":"Guang Yang, Yafeng Yin, H. Man","doi":"10.1109/AIPR.2010.5759689","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759689","url":null,"abstract":"The fusion of images captured from Electrical-Optical (EO) and Infra-Red (IR) cameras has been extensively studied for military applications in recent years. In this paper, we propose a novel wavelet-based framework for online fusion of EO and IR image sequences. The proposed framework provides multiple fusion rules for image fusion as well as a novel edge-based evaluation method to select the optimal fusion rule with respect to different practical scenarios. In the fusion step, EO and IR images are decomposed into different levels by 2D discrete wavelet transform. The wavelet coefficients at each level are combined by a set of fusion rules, such as min-max selection, mean-value, weighted summations, etc. Various fused images are obtained by inverse wavelet transform of combined coefficients. In the evaluation step, Sobel operator is applied on both the fused images and original images. Compared with original images, the remaining edge information in the fused each image is calculated as the fusion quality assessment. Finally, the fused image with the highest assessment value will be selected as the fusion result. In addition, the proposed method can adaptively select the best fusion rule for EO and IR images under different scenarios.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115257744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Synchronizing disparate video streams from laparoscopic operations in simulation-based surgical training
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759691
Zheshen Wang, Baoxin Li
In this paper, we propose a novel approach for synchronizing multiple videos captured during common laparoscopic operations in simulation-based surgical training. The disparate video sources include two hand-view sequences and one tool-view sequence that has no visual overlap with the hand views. Synchronization of the videos is essential for subsequent visual analysis tasks. To the best of our knowledge, there is no prior work dealing with synchronization of completely different visual streams capturing different aspects of the same physical event. In the proposed approach, histograms of dominant motion (HoDM) are extracted as per-frame features. Multi-view sequence correlation (MSC), computed as the accumulated product of pairwise correlations of HoDM magnitudes and the co-occurrence rate of pairwise HoDM orientation patterns, is proposed for ranking candidate temporal alignments. The final relative shifts for synchronizing the videos are determined by maximizing both the overlap length of all sequences and the MSC scores through a coarse-to-fine search procedure. Experiments were performed on 41 groups of videos from two laparoscopic operations, and the performance was compared to a state-of-the-art method, demonstrating the effectiveness of the proposed approach.
{"title":"Synchronizing disparate video streams from laparoscopic operations in simulation-based surgical training","authors":"Zheshen Wang, Baoxin Li","doi":"10.1109/AIPR.2010.5759691","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759691","url":null,"abstract":"In this paper, we propose a novel approach for synchronizing multiple videos captured from common laparoscopic operations in simulation-based surgical training. The disparate video sources include two hand-view sequences and one tool-view sequence that does not contain any visual overlap with the hand views. The synchronization of the video is essential for further visual analysis tasks. To the best of our knowledge, there is no prior work dealing with synchronization of completely different visual streams capturing different aspects of the same physical event. In the proposed approach, histograms of dominant motion (HoDM) are extracted and used as features for each frame. Multi-view sequence correlation (MSC), computed as accumulated products of pairwise correlations of HoDM magnitudes and co-occurrence rate of pairwise HoDM orientation patterns, is proposed for ranking possible configurations of temporal alignment. The final relative shifts for synchronizing the videos are determined by maximizing both the overlap length of all sequences and the MSC scores through a coarse-to-fine search procedure. Experiments were performed on 41 groups of videos of two laparoscopic operations, and the performance was compared to start-of-the-art method, demonstrating the effectiveness of the proposed approach.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131469948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Advanced image processing techniques for extracting regions of interest using multimode IR processing
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759717
J. Caulfield, J. Havlicek
As large-format single-color and multicolor sensors proliferate, viewing and acting on the much higher data volume becomes challenging for end users. We report on processing techniques to effectively extract targets using multiple processing modes: multiple spectral bands as well as multiple pre-processed data bands from Focal Plane Array (FPA) sensors. We have developed image processing techniques that address many of the key pre-processing requirements, including scene-based non-uniformity correction of static and dynamic pixels, multiband processing for object detection, and reduction and management of clutter and non-targets in cluttered environments. A key motivation for these techniques is pre-processing that extracts the small percentage of the image set likely to contain targets and then transmits "active" pixel data while ignoring unchanging pixels. These techniques have demonstrated significant reductions in the raw data and allow the end user to more intelligently select potential data types for object identification without requiring a person in the loop.
{"title":"Advanced image processing techniques for extracting regions of interest using multimode IR processing","authors":"J. Caulfield, J. Havlicek","doi":"10.1109/AIPR.2010.5759717","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759717","url":null,"abstract":"As large format single color and multicolor sensors proliferate challenges in viewing and taking action on a much higher data volume becomes challenging to end users. We report on processing techniques to effectively extract targets utilizing multiple processing modes — both multiple spectral band and multiple pre-processed data bands from Focal Plane Array (FPA) sensors. We have developed image processing techniques which address many of the key pre-processing requirements, including scene based non-uniformity correction of static and dynamic pixels, multiband processing for object detection, and reduction and management of clutter and non-targets in a cluttered environment. Key motivations for these techniques include image pre-processing extracting small percentages of the image set with potentially high likelihood targets and then transmitting “active” pixel data while ignoring unchanging pixels. These techniques have demonstrated significant reductions in the raw data, and allow the end user to more intelligently select potential data types for object identification without requiring a person in the loop.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128707536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vision physiology applied to hyperspectral short wave infrared imaging
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759698
P. Willson, Gabriel Chan, Paul Yun
The hyperspectral space consisting of narrow spectral bands is neither an optimal nor an orthogonal feature space for identifying objects. In this paper we consider a means of reducing the hyperspectral feature space to a multispectral feature space that is orthogonal and optimal for separating objects from the background. The motivation for this work comes from the fact that the retina of the human eye uses only four broad and overlapping spectral response functions, yet it is optimal for detecting objects of multifarious colors in the visible region. Here we explore spectral response functions for the Short Wave Infrared (SWIR) region that are not sharp but broad and overlapping, and even more complex than those found in the retina for the visible region. Treating the measured intensities of the narrow spectral bands as feature vectors of the object of interest, we calculate a new vector space which is effectively a weighted average of the old space but is optimal for separating the object from the background.
{"title":"Vision physiology applied to hyperspectral short wave infrared imaging","authors":"P. Willson, Gabriel Chan, Paul Yun","doi":"10.1109/AIPR.2010.5759698","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759698","url":null,"abstract":"The hyperspectral space consisting of narrow spectral bands is neither an optimal nor an orthogonal feature space when identifying objects. In this paper we consider a means of reducing hyperspectral feature space to a multispectral feature space that is orthogonal and optimal for separation of the objects from background. The motivation for this work is derived from the fact that the retina of the human eye uses only four broad and overlapping spectral response functions and yet it is optimal for detecting objects of multifarious colors in the visible region. In this paper we explore using spectral response functions for the Short Wave Infrared (SWIR) region that are not sharp, but broad and overlapping and even more complex than those found in the retina for the visible region. Treating the measured intensities of the narrow spectral bands as feature vectors of the object of interest, we calculate a new vector space which effectively is a weighted average of the old space, but is optimal for separating the object from the background.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133533520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tactical geospatial intelligence from full motion video
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759699
R. Madison, Yuetian Xu
The current proliferation of Unmanned Aircraft Systems provides an increasing amount of full-motion video (FMV) that, among other things, encodes geospatial intelligence. But the FMV is rarely converted into useful products, so its intelligence potential is wasted. We have developed four concept demonstrations of methods to convert FMV into more immediately useful products: more accurate coordinates for objects of interest; timely, geo-registered, orthorectified imagery; conversion of mouse-clicks to object coordinates; and first-person-perspective visualization of graphical control measures. We believe these concepts can convey valuable geospatial intelligence to the tactical user.
{"title":"Tactical geospatial intelligence from full motion video","authors":"R. Madison, Yuetian Xu","doi":"10.1109/AIPR.2010.5759699","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759699","url":null,"abstract":"The current proliferation of Unmanned Aircraft Systems provides an increasing amount of full-motion video (FMV) that, among other things, encodes geospatial intelligence. But the FMV is rarely converted into useful products, thus the intel potential is wasted. We have developed four concept demonstrations of methods to convert FMV into more immediately useful products, including: more accurate coordinates for objects of interest; timely, geo-registered, orthorectified imagery; conversion of mouse-clicks to object coordinates; and first-person-perspective visualization of graphical control measures. We believe these concepts can convey valuable geospatial intelligence to the tactical user.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"288 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115892413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}