
2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR): Latest Publications

Sign Recognition - How well does Single Shot Multibox Detector sum up? A Quantitative Study
Pub Date : 2018-10-01 DOI: 10.1109/AIPR.2018.8707409
Manikandan Ravikiran
Deep learning for traffic sign detection and recognition (TSDR) has been widely explored in recent years, owing to its state-of-the-art results and the availability of public datasets. Two families of detection networks are currently being developed: single-shot and region-proposal-based approaches. Even though the single-shot method seems adequate for traffic sign detection, very few works to date have investigated this hypothesis quantitatively, with most works focusing on region-proposal-based detection architectures. Moreover, given the complexity of the TSDR task and the limited performance of region-proposal-based approaches, a quantitative study of the single-shot method is warranted, which would in turn reveal its strengths and weaknesses for TSDR. In this paper, we revisit this topic through a quantitative evaluation of the state-of-the-art Single Shot Multibox Detector (SSD) on multiple standard benchmarks. More specifically, we quantify 1) the performance of SSD over multiple existing TSDR benchmarks, namely GTSDB, STSDB, and BTSDB; 2) the generalization of SSD across the datasets; 3) the impact of class overlap on SSD's performance; and 4) the performance of SSD with synthetically generated datasets built from Wikipedia images. Through our study, we show that 1) SSD can reach >0.92 AUC for TSDR across standard benchmarks, and in the process we introduce new benchmarks for Romania (RTSDB) and Finland (FTSDB) in line with GTSDB; 2) an SSD model pretrained on GTSDB generalizes well to BTSDB and RTSDB with an average AUC of 0.90, and comparatively worse to the Sweden and Finland datasets. We find that scale selection and information loss are the primary reasons for the limited generalization; to address these issues, we propose a convex optimization-based scale selection and Skip SSD, an architecture built on the concept of feature reuse that improves generalization. We also show that 3) an SSD model augmented with a small synthetically generated dataset produces close to state-of-the-art accuracy across GTSDB, STSDB, and BTSDB; and 4) class overlap is indeed a challenging problem even for SSD. Further, we present detailed experiments and summarize practical findings for those interested in getting the most out of SSD for TSDR.
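For readers unfamiliar with the single-shot detection interface the paper evaluates, the following sketch runs an off-the-shelf SSD300 from torchvision and filters detections by score. This is not the paper's GTSDB-trained model; the input file name is a placeholder, and torchvision >= 0.13 is assumed for the weights API. A TSDR system would fine-tune such a detector on GTSDB/STSDB/BTSDB before sweeping thresholds for the AUC curves reported above.

```python
# Minimal SSD inference sketch (assumption: torchvision >= 0.13; "scene.jpg" is a placeholder).
import torch
from torchvision.models.detection import ssd300_vgg16, SSD300_VGG16_Weights
from torchvision.io import read_image
from torchvision.transforms.functional import convert_image_dtype

weights = SSD300_VGG16_Weights.DEFAULT
model = ssd300_vgg16(weights=weights).eval()

image = convert_image_dtype(read_image("scene.jpg"), torch.float)  # C x H x W in [0, 1]
with torch.no_grad():
    detections = model([image])[0]           # dict with 'boxes', 'labels', 'scores'

keep = detections["scores"] > 0.5            # one point on the score-threshold sweep
for box, label, score in zip(detections["boxes"][keep],
                             detections["labels"][keep],
                             detections["scores"][keep]):
    print(label.item(), round(score.item(), 3), box.tolist())
```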
Citations: 1
Object Recognition under Lighting Variations using Pre-Trained Networks
Pub Date : 2018-10-01 DOI: 10.1109/AIPR.2018.8707399
Kalpathy Sivaraman, A. Murthy
We report the object-recognition performance of VGG16, ResNet, and SqueezeNet, three state-of-the-art Convolutional Neural Networks (CNNs) trained on ImageNet, across 15 different lighting conditions using the Phos dataset, as well as that of a ResNet-like network trained on Pascal VOC and evaluated on the ExDark dataset. Instabilities in the normalized softmax values are used to highlight that pre-trained networks are not robust to lighting variations. Our investigation yields a robustness analysis framework for analyzing the performance of CNNs under different lighting conditions. The Phos dataset consists of 15 scenes, each captured under different illumination conditions: 9 images under various strengths of uniform illumination and 6 under different degrees of non-uniform illumination. The ExDARK dataset consists of ten scenes under different illumination conditions. A Keras-based pipeline was developed to study the softmax values output by ImageNet-trained VGG16, ResNet, and SqueezeNet for the same object under the 15 different lighting conditions of the Phos dataset. A ResNet architecture was trained end-to-end on the PASCAL VOC dataset. The large variations observed in the softmax values provide empirical evidence of unstable performance and of the need to augment training to account for lighting variations.
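A minimal sketch of the kind of Keras pipeline described above, under the assumption of a recent TensorFlow 2.x install: the same object image is fed through an ImageNet-trained VGG16 at several synthetic brightness levels and the drift of the top-class softmax value is printed. The file name and brightness factors are placeholders, not the Phos illumination settings.

```python
# Softmax-stability probe sketch (assumptions: TensorFlow 2.x; "object.jpg" is a placeholder).
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

model = VGG16(weights="imagenet")

img = tf.keras.utils.load_img("object.jpg", target_size=(224, 224))
base = tf.keras.utils.img_to_array(img)

for factor in (0.4, 0.7, 1.0, 1.3, 1.6):         # crude stand-ins for lighting strengths
    lit = np.clip(base * factor, 0, 255)
    probs = model.predict(preprocess_input(lit[np.newaxis]), verbose=0)[0]
    top = int(probs.argmax())
    print(f"brightness x{factor}: class {top}, softmax {probs[top]:.3f}")
```

Large swings in the printed softmax values across factors would be the kind of instability the abstract points to.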
Citations: 6
Improving Nuclei Classification Performance in H&E Stained Tissue Images Using Fully Convolutional Regression Network and Convolutional Neural Network
Pub Date : 2018-10-01 DOI: 10.1109/AIPR.2018.8707397
Ali S. Hamad, I. Ersoy, F. Bunyak
Detection and classification of nuclei in histopathology images is an important step in understanding the tumor microenvironment and evaluating cancer progression and prognosis. The task is challenging due to imaging factors such as varying cell morphologies, batch-to-batch variations in staining, and sample preparation. We present a two-stage deep learning pipeline that combines a Fully Convolutional Regression Network (FCRN) that performs nuclei localization with a Convolutional Neural Network (CNN) that performs nuclei classification. Instead of using hand-crafted features, the system learns the visual features needed for detection and classification of nuclei, making the process robust to the aforementioned variations. The performance of the proposed system has been quantitatively evaluated on images of hematoxylin and eosin (H&E) stained colon cancer tissues and compared to previous studies using the same data set. The proposed deep learning system produces promising results for detection and classification of nuclei in histopathology images.
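The two-stage idea can be sketched as follows, with placeholder file names standing in for the paper's trained FCRN and CNN (both assumed to be Keras models loaded from disk): stage 1 regresses a nuclei proximity map, stage 2 classifies a small patch around each detected peak. The patch size, peak threshold, and model formats are assumptions for illustration only.

```python
# Two-stage localization + classification sketch (assumptions: trained Keras models
# "fcrn.h5" and "nuclei_cnn.h5" exist; scikit-image is installed).
import numpy as np
from skimage.feature import peak_local_max
from tensorflow import keras

fcrn = keras.models.load_model("fcrn.h5")              # hypothetical localization network
classifier = keras.models.load_model("nuclei_cnn.h5")  # hypothetical patch classifier

def detect_and_classify(tile, patch=32, min_distance=6, threshold=0.5):
    """tile: H x W x 3 float image with values in [0, 1]."""
    prox = fcrn.predict(tile[np.newaxis], verbose=0)[0, ..., 0]   # H x W proximity map
    centers = peak_local_max(prox, min_distance=min_distance,
                             threshold_abs=threshold)             # candidate nuclei (row, col)
    half = patch // 2
    results = []
    for r, c in centers:
        crop = tile[max(r - half, 0):r + half, max(c - half, 0):c + half]
        if crop.shape[:2] != (patch, patch):
            continue                                              # skip border nuclei for brevity
        probs = classifier.predict(crop[np.newaxis], verbose=0)[0]
        results.append(((int(r), int(c)), int(probs.argmax())))
    return results
```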
Citations: 10
Line Segments based Rotation Invariant Descriptor for Disparate Images
Pub Date : 2018-10-01 DOI: 10.1109/AIPR.2018.8707401
Teena Sharma, P. Agrawal, Piyush Sahoo, N. Verma, S. Vasikarla
Real-time computer vision applications demand robust image matching in the presence of disparity between images. This can be achieved using a descriptor vector with scale and rotation invariance. This paper presents a rotation-invariant descriptor vector based on line-point duality. The proposed descriptor uses a simple, consistent method of keypoint detection. The descriptor vector is built from line segments in the input image that fall within a region of interest around each detected keypoint, and is then used for matching disparate images. Experiments are carried out on four different image sets with rotations over a range of angles to validate the real-time performance of the proposed descriptor. For comparison, the normalized match ratio is computed using a multi-layered neural network with two hidden layers.
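As a rough illustration of how line segments around a keypoint can yield a rotation-invariant description, the sketch below extracts segments with a Hough transform and histograms their orientations relative to the longest segment, so a global rotation cancels out. This is in the spirit of the abstract, not the authors' exact formulation; the radius, bin count, and Canny/Hough parameters are illustrative assumptions.

```python
# Illustrative line-segment descriptor (not the paper's method; parameters are assumptions).
import cv2
import numpy as np

def line_segment_descriptor(gray, keypoint, radius=40, bins=12):
    x, y = int(keypoint[0]), int(keypoint[1])
    roi = gray[max(y - radius, 0):y + radius, max(x - radius, 0):x + radius]
    edges = cv2.Canny(roi, 50, 150)
    segments = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=20,
                               minLineLength=10, maxLineGap=3)
    if segments is None:
        return np.zeros(bins)
    angles, lengths = [], []
    for x1, y1, x2, y2 in segments[:, 0]:
        angles.append(np.arctan2(y2 - y1, x2 - x1))
        lengths.append(np.hypot(x2 - x1, y2 - y1))
    reference = angles[int(np.argmax(lengths))]           # dominant orientation
    relative = (np.array(angles) - reference) % np.pi     # rotation of the patch cancels out
    hist, _ = np.histogram(relative, bins=bins, range=(0, np.pi), weights=lengths)
    return hist / (hist.sum() + 1e-9)                     # length-weighted, normalized histogram
```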
Citations: 0
On Evaluating Video-based Generative Adversarial Networks (GANs)
Pub Date : 2018-10-01 DOI: 10.1109/AIPR.2018.8707431
N. Ronquillo, Josh Harguess
We study the problem of evaluating video-based Generative Adversarial Networks (GANs) by applying existing image quality assessment methods to the explicit evaluation of videos generated by state-of-the-art frameworks [1]–[3]. Specifically, we provide results and discussion on using quantitative methods such as the Fréchet Inception Distance [4], the Multi-scale Structural Similarity Measure (MS-SSIM) [5], and the Birthday Paradox-inspired test [6], and compare these to the prevalent performance evaluation methods in the literature. We conclude that current testing methodologies are not sufficient for quality assurance in video-based GAN frameworks, and that methods drawn from the image-based GAN literature are worth considering. The results of our experiments and a discussion on evaluating video-based GANs provide key insights that may be useful for developing new measures of quality assurance in future work.
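For reference, the Fréchet Inception Distance [4] compares the Gaussian statistics of real and generated feature activations: FID = ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2(C_r C_g)^(1/2)). A small numpy/scipy sketch of that distance is shown below; the Inception feature extraction over video frames is assumed to have happened elsewhere, and the function only computes the distance between the two activation sets.

```python
# FID sketch given two (N, D) arrays of feature activations (assumption: features
# were extracted beforehand, e.g. from an Inception network applied to frames).
import numpy as np
from scipy import linalg

def frechet_distance(feats_real, feats_fake):
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    c1 = np.cov(feats_real, rowvar=False)
    c2 = np.cov(feats_fake, rowvar=False)
    covmean, _ = linalg.sqrtm(c1.dot(c2), disp=False)
    covmean = covmean.real                     # drop tiny imaginary parts from numerics
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(c1 + c2 - 2 * covmean))
```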
Citations: 2
Understanding effects of atmospheric variables on spectral vegetation indices derived from satellite based time series of multispectral images
Pub Date : 2018-10-01 DOI: 10.1109/AIPR.2018.8707430
Aleem Khaliq, M. Musci, M. Chiaberge
In agricultural practice, it is essential to monitor crop phenological patterns over time in order to manage agronomic activities such as irrigation, weed control, pest control, fertilization, and drainage. Over the past decade, owing to freely available data and large coverage areas, satellite-based remote sensing has become the most popular and widely used technique compared to alternatives such as physical ground surveys, ground-based sensors, and aerial remote sensing. Sentinel-2 is a European satellite equipped with a state-of-the-art multispectral imager that offers high spectral resolution (13 spectral bands), high spatial resolution (up to 10 m per pixel), and good temporal resolution (6 to 10 days). Given these features, time series of Sentinel-2 multispectral images have been used to establish the temporal pattern of spectral vegetation indices (i.e., NDVI, SAVI, EVI, RVI) of crops and to monitor their phenological behavior over time. In addition, the influence of various atmospheric variables (such as air temperature and precipitation) on the derived spectral vegetation indices is also investigated in this work. The Land Use and Coverage Area frame Survey (LUCAS 2015) is used as ground reference data for this study. The study shows that, using Sentinel-2, understanding the relation between atmospheric conditions and crop phenological behavior can be useful for managing agricultural activities.
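The vegetation indices named in the abstract have compact closed forms; a small numpy sketch is given below, assuming Sentinel-2 bands B2 (blue), B4 (red), and B8 (NIR) already converted to surface reflectance in [0, 1]. The coefficients follow the commonly used definitions (SAVI soil factor L = 0.5, EVI gain 2.5 with coefficients 6, 7.5, 1); the paper's exact processing chain may differ.

```python
# Vegetation index sketch (assumptions: reflectance-scaled Sentinel-2 bands; standard coefficients).
import numpy as np

def vegetation_indices(blue, red, nir, savi_l=0.5):
    eps = 1e-9                                   # avoid division by zero over water/shadow
    ndvi = (nir - red) / (nir + red + eps)
    savi = (1 + savi_l) * (nir - red) / (nir + red + savi_l + eps)
    evi = 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1 + eps)
    rvi = nir / (red + eps)
    return {"NDVI": ndvi, "SAVI": savi, "EVI": evi, "RVI": rvi}
```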
Citations: 3
SAR Target Recognition with Deep Learning
Pub Date : 2018-10-01 DOI: 10.1109/AIPR.2018.8707419
Ryan J. Soldin
The automated detection and classification of objects in imagery is an important topic for many applications in remote sensing. These include counting cars and ships and tracking military vehicles for the defense and intelligence industry. Synthetic aperture radar (SAR) provides day/night and all-weather imaging capabilities. SAR is a powerful data source for Deep Learning (DL) algorithms to provide automatic target recognition (ATR) capabilities. DL classification was shown to be extremely effective on multi-spectral satellite imagery during the IARPA Functional Map of the World (fMoW) challenge. In our work we look to extend these techniques to SAR. We start by applying ResNet-18 to the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset. The MSTAR program, sponsored by DARPA and AFRL, consists of SAR collections of military-style targets acquired with an aerial X-band radar at one-foot resolution. We achieved an overall classification accuracy of 99% on 10 different classes of targets, confirming previously published results. We then extend this classifier to investigate an emerging target and the effects of limited training data on system performance.
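A hedged sketch of adapting an ImageNet-pretrained ResNet-18 to a 10-class SAR-chip task in the spirit of the setup above: single-channel chips are replicated to three channels so the pretrained stem can be reused, and the final layer is replaced. The folder layout "mstar_train/", batch size, and optimizer settings are placeholders, and torchvision >= 0.13 is assumed for the weights API.

```python
# ResNet-18 fine-tuning sketch (assumptions: torchvision >= 0.13; "mstar_train/" is a
# placeholder ImageFolder layout of SAR chips organized by class).
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),   # SAR chips are single-channel
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
train_set = datasets.ImageFolder("mstar_train/", transform=transform)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 10)     # 10 target classes

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()
model.train()
for images, labels in loader:                      # one pass shown; train for several epochs
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```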
Citations: 9
Performance Evaluation of Feature Descriptors for Aerial Imagery Mosaicking
Pub Date : 2018-10-01 DOI: 10.1109/AIPR.2018.8707402
Rumana Aktar, H. Aliakbarpour, F. Bunyak, G. Seetharaman, K. Palaniappan
Mosaicking enables efficient summarization of the geospatial content in an aerial video, with applications in surveillance, activity detection, tracking, etc. Scene clutter, the presence of distractors, parallax, illumination artifacts (i.e., shadows and glare), and other complexities of aerial imaging such as large camera motion make the registration process challenging. Robust feature detection and description are needed to overcome these challenges before registration. This study investigates the computational complexity versus performance of selected feature detectors, namely Structure Tensor with NCC (ST+NCC), SURF, and ASIFT, within our Video Mosaicking and Summarization (VMZ) framework on the VIRAT benchmark aerial video. ST+NCC and SURF are very fast but fail on a few complex images (with occlusion) from VIRAT. ASIFT is more robust than ST+NCC or SURF, though extremely time-consuming. We also propose an Adaptive Descriptor (combining ST+NCC and ASIFT) that is 9x faster than ASIFT with comparable robustness.
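The registration step underlying mosaicking can be sketched compactly with OpenCV: detect features in two overlapping frames, match them with a ratio test, and estimate a homography with RANSAC. SIFT is used here purely for availability (SURF lives in opencv-contrib and ASIFT is not built into OpenCV), so this stands in for, rather than reproduces, the descriptors compared in the study; the ratio and RANSAC thresholds are conventional defaults.

```python
# Frame-to-frame registration sketch (assumption: OpenCV >= 4.4, which ships SIFT in the main module).
import cv2
import numpy as np

def register(frame_a, frame_b):
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(frame_a, None)
    kp_b, des_b = sift.detectAndCompute(frame_b, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des_a, des_b, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # Lowe ratio test

    src = np.float32([kp_a[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H, int(inliers.sum())                  # warp matrix plus inlier count
```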
Citations: 8
Automated Annotation of Satellite Imagery using Model-based Projections
Pub Date : 2018-10-01 DOI: 10.1109/AIPR.2018.8707425
R. Roberts, J. Goforth, G. Weinert, C. Grant, Will R. Ray, B. Stinson, Andrew M. Duncan
GeoVisipedia is a novel approach to annotating satellite imagery. It uses wiki pages, rather than simple labels, to annotate objects. Using wiki pages to hold annotations is particularly useful for annotating objects in imagery of complex geospatial configurations such as industrial facilities. GeoVisipedia uses the PRISM algorithm to project annotations applied to one image onto other imagery, enabling ubiquitous annotation. This paper derives the PRISM algorithm, which uses image metadata and a 3D facility model to create a view matrix unique to each image. The view matrix is used to project model components onto a mask that aligns the components with the objects in the scene that they represent. Wiki pages are linked to model components, which are in turn linked to the image via the component mask. An illustration of the efficacy of the PRISM algorithm is provided, demonstrating the projection of model components onto an effluent stack. We conclude with a discussion of the efficiencies of GeoVisipedia over manual annotation and the use of PRISM for creating training sets for machine learning algorithms.
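A bare-bones numpy sketch of the projection idea behind such a pipeline: 3D model component vertices are pushed through a per-image camera (view) matrix and rasterized into a mask that can be aligned with annotations. The intrinsics K and pose R, t are assumed to come from image metadata; the real PRISM algorithm is richer than this pinhole projection.

```python
# Pinhole projection of model vertices into image pixels (assumption: K, R, t come
# from metadata; mask rasterization here only marks projected vertices).
import numpy as np

def project_points(points_3d, K, R, t):
    """points_3d: (N, 3) model vertices; K: 3x3 intrinsics; R: 3x3 rotation; t: (3,) translation."""
    cam = R @ points_3d.T + t.reshape(3, 1)       # world -> camera coordinates
    pix = K @ cam
    return (pix[:2] / pix[2]).T                   # perspective divide -> (N, 2) pixel coordinates

def component_mask(points_3d, K, R, t, shape):
    mask = np.zeros(shape, dtype=np.uint8)
    for u, v in project_points(points_3d, K, R, t):
        if 0 <= int(v) < shape[0] and 0 <= int(u) < shape[1]:
            mask[int(v), int(u)] = 1              # mark projected vertices; fill/hull as needed
    return mask
```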
Citations: 2
Automated Video Interpretability Assessment using Convolutional Neural Networks
Pub Date : 2018-10-01 DOI: 10.1109/AIPR.2018.8707423
A. Kalukin
A neural network used to automate assessment of video quality, as measured by the Video National Imagery Interpretability Rating Scale (VNIIRS), was able to ascertain the exact VNIIRS rating over 80% of the time.
Citations: 0