Malware Classification using Deep Convolutional Neural Networks
Pub Date: 2018-10-01 | DOI: 10.1109/AIPR.2018.8707429
David Kornish, Justin Geary, Victor Sansing, Soundararajan Ezekiel, Larry Pearlstein, L. Njilla
In recent years, deep convolutional neural networks (DCNNs) have won many contests in machine learning, object detection, and pattern recognition. Furthermore, deep learning techniques have achieved exceptional performance in image classification, reaching accuracy levels beyond human capability. Malware variants from similar categories often contain similarities due to code reuse. Converting malware samples into images can cause these patterns to manifest as image features, which can be exploited for DCNN classification. Techniques for converting malware binaries into images for visualization and classification have been reported in the literature, and while these methods reach a high level of classification accuracy on training datasets, they tend to be vulnerable to overfitting and perform poorly on previously unseen samples. In this paper, we explore and document a variety of techniques for representing malware binaries as images with the goal of discovering a format best suited for deep learning. We implement a database of malware binaries from several families, stored in hexadecimal format. These malware samples are converted into images using various approaches and are used to train a neural network to recognize visual patterns in the input and classify malware based on the feature vectors. Each image type is assessed using a variety of learning models, such as transfer learning with existing DCNN architectures and feature extraction for support vector machine classifier training. Each technique is evaluated in terms of classification accuracy, result consistency, and time per trial. Our preliminary results indicate that improved image representation has the potential to enable more effective classification of new malware.
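The abstract does not specify which binary-to-image encodings were tested, so the following is only a minimal sketch of one commonly reported representation: raw bytes of a sample mapped to pixel intensities of a fixed-width grayscale image that a DCNN can consume. The row width, resize target, and file path are illustrative assumptions, not the authors' format.

```python
import numpy as np
from PIL import Image

def binary_to_grayscale(path, width=256, out_size=(224, 224)):
    """Map the raw bytes of a malware sample to a fixed-size grayscale image.

    Each byte (0-255) becomes one pixel intensity; rows are `width` pixels
    wide, and the result is resized to the input shape expected by a DCNN.
    """
    data = np.fromfile(path, dtype=np.uint8)          # raw byte stream
    rows = int(np.ceil(len(data) / width))
    padded = np.zeros(rows * width, dtype=np.uint8)   # zero-pad the last row
    padded[:len(data)] = data
    img = Image.fromarray(padded.reshape(rows, width), mode="L")
    return img.resize(out_size)                       # e.g. 224x224 for transfer learning

# Usage sketch (hypothetical file path):
# binary_to_grayscale("samples/trojan_variant.bin").save("trojan_variant.png")
```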
{"title":"Malware Classification using Deep Convolutional Neural Networks","authors":"David Kornish, Justin Geary, Victor Sansing, Soundararajan Ezekiel, Larry Pearlstein, L. Njilla","doi":"10.1109/AIPR.2018.8707429","DOIUrl":"https://doi.org/10.1109/AIPR.2018.8707429","url":null,"abstract":"In recent years, deep convolution neural networks (DCNNs) have won many contests in machine learning, object detection, and pattern recognition. Furthermore, deep learning techniques achieved exceptional performance in image classification, reaching accuracy levels beyond human capability. Malware variants from similar categories often contain similarities due to code reuse. Converting malware samples into images can cause these patterns to manifest as image features, which can be exploited for DCNN classification. Techniques for converting malware binaries into images for visualization and classification have been reported in the literature, and while these methods do reach a high level of classification accuracy on training datasets, they tend to be vulnerable to overfitting and perform poorly on previously unseen samples. In this paper, we explore and document a variety of techniques for representing malware binaries as images with the goal of discovering a format best suited for deep learning. We implement a database for malware binaries from several families, stored in hexadecimal format. These malware samples are converted into images using various approaches and are used to train a neural network to recognize visual patterns in the input and classify malware based on the feature vectors. Each image type is assessed using a variety of learning models, such as transfer learning with existing DCNN architectures and feature extraction for support vector machine classifier training. Each technique is evaluated in terms of classification accuracy, result consistency, and time per trial. Our preliminary results indicate that improved image representation has the potential to enable more effective classification of new malware.","PeriodicalId":230582,"journal":{"name":"2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115011057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DCNN Augmentation via Synthetic Data from Variational Autoencoders and Generative Adversarial Networks
Pub Date: 2018-10-01 | DOI: 10.1109/AIPR.2018.8707390
David Kornish, Soundararajan Ezekiel, Maria Scalzo-Cornacchia
Deep convolutional neural networks have recently demonstrated incredible capabilities in areas such as image classification and object detection, but they require large datasets of high-quality, pre-labeled data to achieve high levels of performance. Most data is not properly labeled when it is captured, and the process of manually labeling datasets large enough for effective learning is impractical in many real-world applications. New studies have shown that synthetic data, generated from a simulated environment, can be effective training data for DCNNs. However, synthetic data is only as effective as the simulation from which it is gathered, and there is often a significant trade-off between designing a simulation that properly models real-world conditions and simply gathering better real-world data. Using generative network architectures, such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), it is possible to produce new synthetic samples based on the features of real-world data. This data can be used to augment small datasets to increase DCNN performance, similar to traditional augmentation methods such as scaling, translation, rotation, and adding noise. In this paper, we compare the advantages of synthetic data from GANs and VAEs to traditional data augmentation techniques. Initial results are promising, indicating that using synthetic data for augmentation can improve the accuracy of DCNN classifiers.
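As a rough illustration of the VAE side of this comparison, the sketch below samples latent vectors from the prior of a hypothetical, already-trained decoder and appends the decoded images to a small labeled set. The architecture, latent size, image shape, and sample counts are assumptions for illustration, not the configuration used in the paper.

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Toy VAE decoder; in practice this would be loaded from a trained checkpoint."""
    def __init__(self, latent_dim=32, img_pixels=28 * 28):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, img_pixels), nn.Sigmoid(),  # pixel intensities in [0, 1]
        )

    def forward(self, z):
        return self.net(z).view(-1, 1, 28, 28)

def augment_with_vae(real_images, real_labels, decoder, label, n_synthetic=500, latent_dim=32):
    """Draw latent vectors from the VAE prior N(0, I), decode them into
    synthetic images, and concatenate them with the real training data."""
    with torch.no_grad():
        z = torch.randn(n_synthetic, latent_dim)
        synthetic = decoder(z)
    images = torch.cat([real_images, synthetic], dim=0)
    labels = torch.cat([real_labels, torch.full((n_synthetic,), label, dtype=torch.long)], dim=0)
    return images, labels

# Usage sketch with random stand-in data for one class:
decoder = Decoder()
x, y = augment_with_vae(torch.rand(100, 1, 28, 28),
                        torch.zeros(100, dtype=torch.long),
                        decoder, label=0)
```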
{"title":"DCNN Augmentation via Synthetic Data from Variational Autoencoders and Generative Adversarial Networks","authors":"David Kornish, Soundararajan Ezekiel, Maria Scalzo-Cornacchia","doi":"10.1109/AIPR.2018.8707390","DOIUrl":"https://doi.org/10.1109/AIPR.2018.8707390","url":null,"abstract":"Deep convolutional neural networks have recently demonstrated incredible capabilities in areas such as image classification and object detection, but they require large datasets of quality pre-labeled data to achieve high levels of performance. Almost all data is not properly labeled when it is captured, and the process of manually labeling large enough datasets for effective learning is impractical in many real-world applications. New studies have shown that synthetic data, generated from a simulated environment, can be effective training data for DCNNs. However, synthetic data is only as effective as the simulation from which it is gathered, and there is often a significant trade-off between designing a simulation that properly models real-world conditions and simply gathering better real-world data. Using generative network architectures, such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), it is possible to produce new synthetic samples based on the features of real-world data. This data can be used to augment small datasets to increase DCNN performance, similar to traditional augmentation methods such as scaling, translation, rotation, and adding noise. In this paper, we compare the advantages of synthetic data from GANs and VAEs to traditional data augmentation techniques. Initial results are promising, indicating that using synthetic data for augmentation can improve the accuracy of DCNN classifiers.","PeriodicalId":230582,"journal":{"name":"2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117122942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Principles of Dual Fusion Detection
Pub Date: 2018-10-01 | DOI: 10.1109/AIPR.2018.8707417
A. Schaum
We describe a new approach to solving binary composite hypothesis testing problems and prove its equivalence to a constrained form of clairvoyant fusion. The constraint resolves an abiding theoretical conundrum: the non-commutativity of the fusion order. We then illustrate use of the new constraint by addressing a common limitation in image-based spectral detection, false alarms caused by outliers.
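For readers unfamiliar with the terminology, the standard setup being referenced is sketched below (assuming, for brevity, a simple null hypothesis); the paper's specific constraint on the order of fusion is not reproduced here.

```latex
% Background sketch only: a binary composite hypothesis test with clairvoyant
% (known-parameter) likelihood ratios, which fusion schemes then combine,
% e.g. in the GLRT-like form shown last.
\[
H_0 : x \sim p_0(x)
\qquad \text{vs.} \qquad
H_1 : x \sim p(x \mid \theta), \ \theta \in \Theta_1
\]
\[
\Lambda_\theta(x) = \frac{p(x \mid \theta)}{p_0(x)}
\quad \text{(clairvoyant detector for a known } \theta\text{)}
\]
\[
\max_{\theta \in \Theta_1} \Lambda_\theta(x)
\;\underset{H_0}{\overset{H_1}{\gtrless}}\; \tau
\]
```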
{"title":"Principles of Dual Fusion Detection","authors":"A. Schaum","doi":"10.1109/AIPR.2018.8707417","DOIUrl":"https://doi.org/10.1109/AIPR.2018.8707417","url":null,"abstract":"We describe a new approach to solving binary composite hypothesis testing problems and prove its equivalence to a constrained form of clairvoyant fusion. The constraint resolves an abiding theoretical conundrum: the non-commutativity of the fusion order. We then illustrate use of the new constraint by addressing a common limitation in image-based spectral detection, false alarms caused by outliers.","PeriodicalId":230582,"journal":{"name":"2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126439384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep Learning for Recognizing Mobile Targets in Satellite Imagery
Pub Date: 2018-10-01 | DOI: 10.1109/AIPR.2018.8707415
M. D. Pritt
There is an increasing demand for software that automatically detects and classifies mobile targets such as airplanes, cars, and ships in satellite imagery. Applications of such automated target recognition (ATR) software include economic forecasting, traffic planning, maritime law enforcement, and disaster response. This paper describes the extension of a convolutional neural network (CNN) for classification to a sliding window algorithm for detection. It is evaluated on mobile targets of the xView dataset, on which it achieves detection and classification accuracies higher than 95%.
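The abstract describes extending a chip classifier to detection via a sliding window; a minimal sketch of that idea follows. The window size, stride, score threshold, and the absence of non-maximum suppression are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np

def sliding_window_detect(image, classify_chip, window=224, stride=112, threshold=0.5):
    """Run a chip classifier over a large image in a sliding-window fashion.

    `classify_chip` is any function mapping a (window, window, channels) array
    to a dict of class probabilities; detections are windows whose best
    non-background score exceeds `threshold`.
    """
    detections = []
    h, w = image.shape[:2]
    for y in range(0, h - window + 1, stride):
        for x in range(0, w - window + 1, stride):
            chip = image[y:y + window, x:x + window]
            probs = classify_chip(chip)
            label, score = max(probs.items(), key=lambda kv: kv[1])
            if label != "background" and score >= threshold:
                detections.append({"box": (x, y, window, window),
                                   "label": label, "score": score})
    return detections

# Usage sketch with a stand-in classifier (replace with the trained CNN):
fake_classifier = lambda chip: {"background": 0.7, "airplane": 0.2, "ship": 0.1}
dets = sliding_window_detect(np.zeros((1024, 1024, 3), dtype=np.uint8), fake_classifier)
```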
{"title":"Deep Learning for Recognizing Mobile Targets in Satellite Imagery","authors":"M. D. Pritt","doi":"10.1109/AIPR.2018.8707415","DOIUrl":"https://doi.org/10.1109/AIPR.2018.8707415","url":null,"abstract":"There is an increasing demand for software that automatically detects and classifies mobile targets such as airplanes, cars, and ships in satellite imagery. Applications of such automated target recognition (ATR) software include economic forecasting, traffic planning, maritime law enforcement, and disaster response. This paper describes the extension of a convolutional neural network (CNN) for classification to a sliding window algorithm for detection. It is evaluated on mobile targets of the xView dataset, on which it achieves detection and classification accuracies higher than 95%.","PeriodicalId":230582,"journal":{"name":"2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127883715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploring the Effects of Class-Specific Augmentation and Class Coalescence on Deep Neural Network Performance Using a Novel Road Feature Dataset
Pub Date: 2018-10-01 | DOI: 10.1109/AIPR.2018.8707406
Tyler W. Nivin, G. Scott, J. A. Hurt, Raymond L. Chastain, C. Davis
The identification of nodal road network features in remote sensing imagery is an important object detection task due to its versatility of application. A successful capability enables urban sprawl tracking, automatic or semi-automated map accuracy validation and updating, and macro-scale infrastructure damage evaluation and tracking, to name a few. We have curated a custom, novel dataset that includes nodal road network features such as bridges, cul-de-sacs, freeway exchanges and exits, freeway overpasses, intersections, and traffic circles. From this curated data we have evaluated the use of deep machine learning for object recognition across two variations of this image dataset: expanded versus semantically coalesced classes. We have evaluated the performance of two deep convolutional neural networks, ResNet50 and Xception, in detecting these features across these variations of the image dataset. We have also explored the use of class-specific data augmentation to improve the performance of the models trained for nodal road network feature detection. Cross-validation performance of the models evaluated on four variations of this nodal road network feature dataset ranges from 0.81 to 0.96 (F1 scores). Coalescing highly specific, semantically challenging classes into more semantically generalized classes has a significant impact on the accuracy of the models. Our analysis provides insight into whether and how these techniques can improve the performance of machine learning models, facilitating application to broad-area imagery analysis in numerous application domains.
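A minimal sketch of the two dataset manipulations named here, class-specific augmentation and class coalescence, is given below. The class names, the coalescence mapping, and the stand-in augmentation function are hypothetical; the paper's actual taxonomy and augmentation operations are not given in the abstract.

```python
# Hypothetical mapping from expanded classes to semantically coalesced classes;
# the actual taxonomy used in the paper is not given in the abstract.
COALESCE = {
    "freeway_exit": "freeway_interchange",
    "freeway_exchange": "freeway_interchange",
    "freeway_overpass": "overpass",
    "traffic_circle": "intersection",
    "intersection": "intersection",
}

def class_specific_augment(samples, target_classes, augment_fn, copies=3):
    """Augment only the listed (e.g. under-performing) classes, producing
    `copies` extra samples per original; other classes are left untouched."""
    augmented = list(samples)
    for item, label in samples:
        if label in target_classes:
            augmented.extend((augment_fn(item), label) for _ in range(copies))
    return augmented

def coalesce_labels(samples):
    """Relabel (image, expanded_label) pairs with their coalesced classes."""
    return [(item, COALESCE.get(label, label)) for item, label in samples]

# Usage sketch: augment weak expanded classes first, then coalesce the labels.
# A real augment_fn would rotate/flip/jitter the image; the lambda is a stand-in.
samples = [("img_001.jpg", "freeway_exit"), ("img_002.jpg", "bridge")]
expanded = class_specific_augment(samples, {"freeway_exit"},
                                  augment_fn=lambda p: p + "?rot=90")
coalesced = coalesce_labels(expanded)
```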
{"title":"Exploring the Effects of Class-Specific Augmentation and Class Coalescence on Deep Neural Network Performance Using a Novel Road Feature Dataset","authors":"Tyler W. Nivin, G. Scott, J. A. Hurt, Raymond L. Chastain, C. Davis","doi":"10.1109/AIPR.2018.8707406","DOIUrl":"https://doi.org/10.1109/AIPR.2018.8707406","url":null,"abstract":"The identification of nodal road network features in remote sensing imagery is an important object detection task due to its versatility of application. A successful capability enables urban sprawl tracking, automatic or semi-automated map accuracy validation and updating, and macro-scale infrastructure damage evaluation and tracking just to name a few. We have curated a custom, novel dataset that includes nodal road network features such as bridges, cul-de-sacs, freeway exchanges and exits, freeway overpasses, intersections, and traffic circles. From this curated data we have evaluated the use of deep machine learning for object recognition across two variations in this image dataset. These variations are expanded versus semantically coalesced classes. We have evaluated the performance of two deep convolutional neural networks, ResNet50 and Xception, to detect these features across these variations of the image datasets. We have also explored the use of class-specific data augmentation to improve the performance of the models trained for nodal road network feature detection. Cross-validation performance of the models evaluated on four variations of this nodal road network feature dataset range from 0.81 to 0.96 (F1 scores). Coalescing highly specific, semantically challenging classes into more semantically generalized classes has a significant impact on the accuracy of the models. Our analysis provides insight into if and how these techniques can improve the performance of machine learning models, facilitating application to broad area imagery analysis in numerous application domains.","PeriodicalId":230582,"journal":{"name":"2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134254228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cybersecurity Considerations for Image Pattern Recognition Applications
Pub Date: 2018-10-01 | DOI: 10.1109/AIPR.2018.8707427
J. Straub
Pattern recognition and image analysis find application in numerous areas. They support law enforcement, transportation security and warfighting. They also have numerous commercial purposes, ranging from agriculture to manufacturing to facility security. This paper discusses the security needs of image recognition systems and the particular concerns that the algorithms and methods used for these activities pose. It also discusses how to secure these analysis systems and the future work required in these areas.
{"title":"Cybersecurity Considerations for Image Pattern Recognition Applications","authors":"J. Straub","doi":"10.1109/AIPR.2018.8707427","DOIUrl":"https://doi.org/10.1109/AIPR.2018.8707427","url":null,"abstract":"Pattern recognition and image analysis find application in numerous areas. They support law enforcement, transportation security and warfighting. They also have numerous commercial purposes, ranging from agriculture to manufacturing to facility security. This paper discusses the security needs of image recognition systems and the particular concerns that the algorithms and methods used for these activities pose. It also discusses how to secure these analysis systems and the future work required in these areas.","PeriodicalId":230582,"journal":{"name":"2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"37 10","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121002923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Characterizing the Visual Social Media Environment of Eating Disorders
Pub Date: 2018-10-01 | DOI: 10.1109/AIPR.2018.8707400
Samsara N. Counts, J. Manning, Robert Pless
Eating disorders are often exacerbated by exposure to triggering images on social media. Standard approaches to filtering social media by detecting hashtags or keywords are difficult to keep accurate because those tags migrate or change over time. In this work we present proof-of-concept demonstrations showing that deep learning classification algorithms are effective at classifying images related to eating disorders. We discuss some of the challenges in this domain and show that careful curation of the training data improves performance substantially.
{"title":"Characterizing the Visual Social Media Environment of Eating Disorders","authors":"Samsara N. Counts, J. Manning, Robert Pless","doi":"10.1109/AIPR.2018.8707400","DOIUrl":"https://doi.org/10.1109/AIPR.2018.8707400","url":null,"abstract":"Eating disorders are often exacerbated by exposure to triggering images on social media. Standard approaches to filtering of social media by detecting hashtags or keywords are difficult to keep accurate because those migrate or change over time. In this work we present proof-of-concept demonstrations to show that Deep Learning classification algorithms are effective at classifying images related to eating disorders. We discuss some of the challenges in this domain and show that careful curation of the training data improves performance substantially.","PeriodicalId":230582,"journal":{"name":"2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117235738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predicting Interpretability Loss in Thermal IR Imagery due to Compression
Pub Date: 2018-10-01 | DOI: 10.1109/AIPR.2018.8707416
Hua-mei Chen, J. Irvine, Zhonghai Wang, Genshe Chen, Erik Blasch, James Nagy
Analysis of thermal infrared (IR) imagery is critical to many law enforcement and military missions, particularly for operations at night or in low-light conditions. Transmitting the imagery data from the sensor to the operator often relies on limited-bandwidth channels, leading to information loss. This paper develops a method, known as the Compression Degradation Image Function Index (CoDIFI) framework, that predicts the degradation in interpretability associated with a specific image compression method and level of compression. Quantification of image interpretability relies on the National Imagery Interpretability Rating Scale (NIIRS). Building on previously reported development and validation of CoDIFI operating on electro-optical (EO) imagery collected in the visible region, this paper extends CoDIFI to imagery collected in the mid-wave infrared (MWIR) region, approximately 3 to 5 microns. For the infrared imagery application, the IR NIIRS is the standard for quantifying image interpretability, and the prediction model rests on the general image quality equation (GIQE). A prediction model using CoDIFI for IR imagery is established with empirical validation. By leveraging CoDIFI in operational settings, users can select a compression level that preserves the required NIIRS level of the delivered imagery while optimizing the use of scarce data transmission capacity.
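For context, a GIQE-style model has the general structure sketched below; the specific coefficients and the compression-dependent terms that CoDIFI adds are not given in the abstract and are not reproduced here.

```latex
% Generic GIQE-style structure relating predicted NIIRS to sensor/image
% quality parameters (coefficients c_0..c_4 are version-dependent).
\[
\mathrm{NIIRS} \;\approx\; c_0
  \;-\; c_1 \log_{10}(\mathrm{GSD})
  \;+\; c_2 \log_{10}(\mathrm{RER})
  \;-\; c_3\, H
  \;-\; c_4\, \frac{G}{\mathrm{SNR}}
\]
% GSD: ground sample distance, RER: relative edge response,
% H: edge overshoot, G: noise gain, SNR: signal-to-noise ratio.
% A CoDIFI-style predictor would then model the interpretability loss,
% \Delta\mathrm{NIIRS}, as a function of compression method and ratio.
```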
{"title":"Predicting Interpretability Loss in Thermal IR Imagery due to Compression","authors":"Hua-mei Chen, J. Irvine, Zhonghai Wang, Genshe Chen, Erik Blasch, James Nagy","doi":"10.1109/AIPR.2018.8707416","DOIUrl":"https://doi.org/10.1109/AIPR.2018.8707416","url":null,"abstract":"Analysis of thermal Infrared (IR) imagery is critical to many law enforcement and military missions, particularly for operations at night or in low-light conditions. Transmitting the imagery data from the sensor to the operator often relies on limited bandwidth channels, leading to information loss. This paper develops a method, known as the Compression Degradation Image Function Index (CoDIFI) framework, that predicts the degradation in interpretability associated with the specific image compression method and level of compression. Quantification of the image interpretability relies on the National Imagery Interpretability Ratings Scale (NIIRS). Building on previously reported development and validation of CoDIFI operating on electro-optical (EO) imagery collected in the visible region, this paper extends CoDIFI to imagery collected in the mid-wave infrared (MWIR) region, approximately 3 to 5 microns. For the infrared imagery application, the IR NIIRS is the standard for quantifying image interpretability and the prediction model rests on the general image quality equation (GIQE). A prediction model using the CoDIFI for IR imagery is established with empirical validation. By leveraging the CoDIFI in operational settings, mission success ensures that the compression selection is achievable in terms of the NIIRS level of imagery data delivered to users, while optimizing the use of scarce data transmission capacity.","PeriodicalId":230582,"journal":{"name":"2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"283 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121059695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Normalcy Modeling Using a Dictionary of Activities Learned from Motion Imagery Tracking Data
Pub Date: 2018-10-01 | DOI: 10.1109/AIPR.2018.8707422
J. Irvine, L. Mariano, Teal Guidici
Target tracking derived from motion imagery provides a capability to detect, recognize, and analyze activities in a manner not possible with still images. Target tracking enables automated activity analysis. In this paper, we develop methods for automatically exploiting the tracking data derived from motion imagery, or other tracking data, to detect and recognize activities, develop models of normal behavior, and detect departures from normalcy. The critical steps in our approach are to construct a syntactic representation of the track behaviors and map this representation to a small set of learned activities. We have developed methods for representing activities through syntactic analysis of the track data by "tokenizing" the track, i.e., converting the kinematic information into strings of symbols amenable to further analysis. The syntactic analysis of target tracks is the foundation for constructing an expandable dictionary of activities. Through unsupervised learning on the tokenized track data we discover the common activities. The probability distribution of these learned activities is the "dictionary". Newly acquired track data is compared to the dictionary to flag atypical behaviors as departures from normalcy. We demonstrate the methods with two relevant data sets: the Porto taxi data and a set of video data acquired at Draper. These data sets illustrate the flexibility and power of these methods for activity analysis.
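A minimal sketch of the tokenize-then-model idea is given below. The symbol alphabet, speed and turn thresholds, and the n-gram frequency dictionary are illustrative stand-ins; the paper discovers its activity dictionary through unsupervised learning rather than simple n-gram counting.

```python
from collections import Counter
import math

def tokenize_track(track, slow=2.0, fast=10.0, turn_deg=30.0):
    """Convert a kinematic track into a string of symbols.

    `track` is a list of (speed_m_s, heading_deg) samples. Alphabet (assumed):
      S = stopped/slow, C = cruising, F = fast, L/R = left/right turn.
    """
    symbols = []
    prev_heading = None
    for speed, heading in track:
        if prev_heading is not None:
            delta = (heading - prev_heading + 180) % 360 - 180
            if delta > turn_deg:
                symbols.append("R")
            elif delta < -turn_deg:
                symbols.append("L")
        symbols.append("S" if speed < slow else "F" if speed > fast else "C")
        prev_heading = heading
    return "".join(symbols)

def build_dictionary(token_strings, n=3):
    """Estimate the probability of each length-n activity 'word' (n-gram)."""
    counts = Counter(s[i:i + n] for s in token_strings for i in range(len(s) - n + 1))
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}

def normalcy_score(token_string, dictionary, n=3, floor=1e-6):
    """Average log-probability of a track's n-grams; low values flag departures
    from normalcy."""
    grams = [token_string[i:i + n] for i in range(len(token_string) - n + 1)]
    if not grams:
        return 0.0
    return sum(math.log(dictionary.get(g, floor)) for g in grams) / len(grams)

# Usage sketch: build the dictionary from historical tracks, score a new one.
dictionary = build_dictionary(["CCCCLCCC", "CCRCCC"])
score = normalcy_score(tokenize_track([(5.0, 0.0), (5.0, 0.0), (5.0, 90.0)]), dictionary)
```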
{"title":"Normalcy Modeling Using a Dictionary of Activities Learned from Motion Imagery Tracking Data","authors":"J. Irvine, L. Mariano, Teal Guidici","doi":"10.1109/AIPR.2018.8707422","DOIUrl":"https://doi.org/10.1109/AIPR.2018.8707422","url":null,"abstract":"Target tracking derived from motion imagery provides a capability to detect, recognize, and analyze activities in a manner not possible with still images. Target tracking enables automated activity analysis. In this paper, we develop methods for automatically exploiting the tracking data derived from motion imagery, or other tracking data, to detect and recognize activities, develop models of normal behavior, and detect departure from normalcy. The critical steps in our approach are to construct a syntactic representation of the track behaviors and map this representation to a small set of learned activities. We have developed methods for representing activities through syntactic analysis of the track data, by \"tokenizing\" the track, i.e. converting the kinematic information into strings of symbols amenable to further analysis. The syntactic analysis of target tracks is the foundation for constructing an expandable dictionary of activities. Through unsupervised learning on the tokenized track data we discovery the common activities. The probability distribution of these learned activities is the \"dictionary\". Newly acquired track data is compared to the dictionary to flag atypical behaviors as departures from normalcy. We demonstrate the methods with two relevant data sets: the Porto taxi data and a set of video data acquired at Draper. These data sets illustrate the flexibility and power of these methods for activity analysis.","PeriodicalId":230582,"journal":{"name":"2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116796231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Benchmark Meta-Dataset of High-Resolution Remote Sensing Imagery for Training Robust Deep Learning Models in Machine-Assisted Visual Analytics
Pub Date: 2018-10-01 | DOI: 10.1109/AIPR.2018.8707433
J. A. Hurt, G. Scott, Derek T. Anderson, C. Davis
Recent years have seen the publication of various high-resolution remote sensing imagery benchmark datasets. These datasets, while diverse in design, have many co-occurring object classes that are of interest for various application domains of Earth observation. In this research, we present our evaluation of a new meta-benchmark dataset combining object classes from the UC Merced, WHU-RS19, PatternNet, and RESISC-45 benchmark datasets. We provide open-source resources to acquire the individual benchmark datasets and then agglomerate them into a new meta-dataset (MDS). Prior research has shown that contemporary deep convolutional neural networks are able to achieve cross-validation accuracies in the range of 95-100% for the 33 identified object classes. Our analysis shows that the overall accuracy for all object classes from these benchmarks is approximately 98.6%. In this work, we investigate the utility of agglomerating the benchmarks into an MDS to train more generalizable deep machine learning (DML) models that translate better from the lab to the real world. We evaluate numerous state-of-the-art architectures, as well as our data-driven DML model fusion techniques. Finally, we compare MDS performance with that of the individual benchmark datasets to evaluate the performance-versus-cost trade-off of using multiple DML models in an ensemble system.
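A minimal sketch of agglomerating folder-per-class benchmarks into a single meta-dataset is shown below. The directory layouts and the class-name mapping are hypothetical; the authors' open-source resources define the actual 33-class mapping and are not reproduced here.

```python
import shutil
from pathlib import Path

# Hypothetical harmonization of class names across benchmarks (illustrative only).
CLASS_MAP = {
    ("UCMerced", "tenniscourt"): "tennis_court",
    ("PatternNet", "tennis_court"): "tennis_court",
    ("RESISC45", "tennis_court"): "tennis_court",
    ("WHU-RS19", "Airport"): "airport",
    ("RESISC45", "airport"): "airport",
}

def build_meta_dataset(benchmark_roots, out_root="MDS"):
    """Copy images from several folder-per-class benchmark datasets into one
    unified meta-dataset, keeping only classes covered by CLASS_MAP."""
    out = Path(out_root)
    for name, root in benchmark_roots.items():
        for class_dir in Path(root).iterdir():
            target = CLASS_MAP.get((name, class_dir.name))
            if target is None or not class_dir.is_dir():
                continue
            dest = out / target
            dest.mkdir(parents=True, exist_ok=True)
            for img in class_dir.glob("*"):
                # Prefix with the source benchmark to avoid filename collisions.
                shutil.copy(img, dest / f"{name}_{img.name}")

# Usage sketch (hypothetical local paths):
# build_meta_dataset({"UCMerced": "data/UCMerced_LandUse/Images",
#                     "RESISC45": "data/NWPU-RESISC45"})
```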
{"title":"Benchmark Meta-Dataset of High-Resolution Remote Sensing Imagery for Training Robust Deep Learning Models in Machine-Assisted Visual Analytics","authors":"J. A. Hurt, G. Scott, Derek T. Anderson, C. Davis","doi":"10.1109/AIPR.2018.8707433","DOIUrl":"https://doi.org/10.1109/AIPR.2018.8707433","url":null,"abstract":"Recent years have seen the publication of various high-resolution remote sensing imagery benchmark datasets. These datasets, while diverse in design, have many co-occurring object classes that are of interest for various application domains of Earth observation. In this research, we present our evaluation of a new meta-benchmark dataset combining object classes from the UC Merced, WHU-RS19, PatternNet, and RESISC-45 benchmark datasets. We provide open-source resources to acquire the individual benchmark datasets and then agglomerate them into a new meta-dataset (MDS). Prior research has shown that contemporary deep convolutional neural networks are able to achieve cross-validation accuracies in the range of 95-100% for the 33 identified object classes. Our analysis shows that the overall accuracy for all object classes from these benchmarks is approximately 98.6%. In this work, we investigate the utility of agglomerating the benchmarks into an MDS to train more generalizable, and therefore translatable from lab to real-world, deep machine learning (DML) models. We evaluate numerous state-of-the-art architectures, as well as our data-driven DML model fusion techniques. Finally, we compare MDS performance with that of the benchmark datasets to evaluate the performance versus cost trade-off of using multiple DML in an ensemble system.","PeriodicalId":230582,"journal":{"name":"2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131282838","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}