A Novel Signature Watermarking Scheme for Identity Protection
Sunpreet Sharma, J. Zou, G. Fang
Pub Date: 2020-11-29 | DOI: 10.1109/DICTA51227.2020.9363396
A novel non-blind watermarking technique for identity protection is presented. The proposed watermarking scheme uses the owner's signature as the watermark, through which the ownership and validity of the document can be proven and kept intact. The proposed scheme is robust, imperceptible and faster than other state-of-the-art methods. Experimental simulations and evaluations of the proposed method show excellent results from both objective and subjective viewpoints.
Temporal 3D RetinaNet for fish detection
Zhou Shen, Chuong H. Nguyen
Pub Date: 2020-11-29 | DOI: 10.1109/DICTA51227.2020.9363372
Automatic detection and tracking of fish provides valuable information for marine life science. Deep convolutional networks have been applied with some success, but performance is affected by challenging imaging conditions, including complex backgrounds, variation in lighting and the low visibility of the underwater environment. Existing works, including Fast R-CNN and RetinaNet, rely on single-frame fish detection and suffer from noisy and unreliable detections. In this paper, we propose and examine two 3D deep learning networks that use temporal features to improve fish detection performance. The first, called 3D-backbone RetinaNet, uses a 3D ResNet backbone to capture temporal information and is found to perform worse than 2D RetinaNet. The second, called 3D-subnets RetinaNet, uses 3D regression and classification subnets to extract temporal information and is found to perform better than 2D RetinaNet. To validate the performance of these networks, we also created a new fish dataset, which will be made publicly available along with the code of the proposed networks.
Contour Detection of Multiple Moving Objects in Unconstrained Scenes using Optical Strain
Maria Oliver-Parera, Julien Muzeau, P. Ladret, P. Bertolino
Pub Date: 2020-11-29 | DOI: 10.1109/DICTA51227.2020.9363368
Moving Object Detection (MOD) is still an active area of research due to the number of scenarios it can tackle and the different characteristics that may appear in them. Consequently, finding a single method that performs well in all situations is a challenging task. In this paper, we address the MOD problem from a physical point of view: given the optical flow between two images, we propose to find its motion boundaries by means of the optical strain, which gives information about the deformation of any vector field. As the optical strain detects all the motion in a sequence, we propose to work on temporal windows and apply thresholding on them in order to separate noise from real motion. The proposed approach shows competitive results when compared to other methods on known datasets.
Evolutionary Attention Network for Medical Image Segmentation
T. Hassanzadeh, D. Essam, R. Sarker
Pub Date: 2020-11-29 | DOI: 10.1109/DICTA51227.2020.9363425
Medical image segmentation is an active research topic that analyses medical images to find an organ or possible abnormalities in an image. Using a Convolutional Neural Network (CNN) is a successful technique for medical image segmentation. However, developing a CNN is a difficult task, especially when it includes complex structures such as an attention mechanism. A CNN equipped with an attention mechanism is able to focus on a specific part of an image to extract a Region Of Interest (ROI), which can play a significant role in increasing the accuracy of image segmentation. Due to the difficulty of developing an attention network, in this paper we introduce a new evolutionary technique to generate an attention network automatically for medical image segmentation. To the best of our knowledge, this is the first attempt to create an attention network using an evolutionary technique. To do this, a new encoding model is introduced to create a network topology, along with its training parameters, to ease the complexity of developing a CNN. Also, a Genetic Algorithm (GA) is applied to evolve the networks. To show the capability of the proposed technique, we used three publicly available medical segmentation datasets. The obtained results show that the proposed model can generate networks corresponding to each dataset, such that the developed networks achieve high performance for medical image segmentation.
WEmbSim: A Simple yet Effective Metric for Image Captioning
Naeha Sharif, Lyndon White, Bennamoun, Wei Liu, Syed Afaq Ali Shah
Pub Date: 2020-11-29 | DOI: 10.1109/DICTA51227.2020.9363392
The area of automatic image caption evaluation is still undergoing intensive research to address the need for generated captions that meet adequacy and fluency requirements. Based on our past attempts at developing highly sophisticated learning-based metrics, we have discovered that a simple cosine similarity measure using the Mean of Word Embeddings (MOWE) of captions achieves surprisingly high performance on unsupervised caption evaluation. This inspires our proposed metric, WEmbSim, which beats complex measures such as SPICE, CIDEr and WMD at system-level correlation with human judgments. Moreover, it also achieves the best accuracy at matching human consensus scores for caption pairs among commonly used unsupervised methods. Therefore, we believe that WEmbSim sets a new baseline for any complex metric to be justified.
Max-Variance Convolutional Neural Network Model Compression
Tanya Boone-Sifuentes, A. Robles-Kelly, A. Nazari
Pub Date: 2020-11-29 | DOI: 10.1109/DICTA51227.2020.9363347
In this paper, we present a method for convolutional neural network model compression that is based on the removal of filter banks corresponding to unimportant weights. To do this, we depart from the relationship between consecutive layers so as to obtain a factor that can be used to assess the degree to which each pair of filters is coupled. This allows us to use the unit-response of the coupling between two layers to remove pathways in the network that are negligible. Moreover, since the back-propagation gradients tend to diminish as the chain rule is applied from the output to the input layer, here we maximise the variance of the coupling factors while enforcing a monotonicity constraint that ensures the most relevant pathways are preserved. We show results on widely used networks employing classification and facial expression recognition datasets. In our experiments, our approach delivers a very competitive trade-off between compression rates and performance compared to both the uncompressed models and alternatives in the literature.
A-DeepPixBis: Attentional Angular Margin for Face Anti-Spoofing
M. Hossain, L. Rupty, Koushik Roy, Mohammed Hasan, Shirshajit Sengupta, Nabeel Mohammed
Pub Date: 2020-11-29 | DOI: 10.1109/DICTA51227.2020.9363382
Face Anti-Spoofing (FAS) systems are used to identify malicious spoofing attempts targeting face recognition systems using media such as video replays or printed papers. With the increasing adoption of face recognition technology as a biometric authentication method, FAS techniques are gaining in importance. From a learning perspective, such systems pose a binary classification task. When implemented with neural network based solutions, it is common to use the binary cross-entropy (BCE) function as the loss to optimize. In this study, we propose a variant of BCE that enforces a margin in angular space and incorporate it in training the DeepPixBis model [1]. In addition, we present a method to incorporate such a loss for attentive pixel-wise supervision applicable in a fully convolutional setting. Our proposed approach achieves competitive scores in both intra- and inter-dataset testing on multiple benchmark datasets, consistently outperforming vanilla DeepPixBis. Interestingly, on Protocol 4 of OULU-NPU, considered to be the hardest protocol, our proposed method achieves 5.22% ACER, which is only 0.22% higher than the current state of the art, without requiring any expensive Neural Architecture Search.
Generalised Zero-shot Learning with Multi-modal Embedding Spaces
Rafael Felix, M. Sasdelli, Ben Harwood, G. Carneiro
Pub Date: 2020-11-29 | DOI: 10.1109/DICTA51227.2020.9363405
Generalised zero-shot learning (GZSL) methods aim to classify previously seen and unseen visual classes by leveraging the semantic information of those classes. In the context of GZSL, semantic information is non-visual data such as a text description of the seen and unseen classes. Previous GZSL methods have explored transformations between visual and semantic spaces, as well as the learning of a latent joint visual and semantic space. In these methods, even though learning has explored a combination of spaces (i.e., visual, semantic or joint latent space), inference tended to focus on using just one of the spaces. By hypothesising that inference must explore all three spaces, we propose a new GZSL method based on a multi-modal classification over visual, semantic and joint latent spaces. Another issue affecting current GZSL methods is the intrinsic bias toward the classification of seen classes, a problem that is usually mitigated by a domain classifier which modulates seen and unseen classification. Our proposed approach replaces the modulated classification by a computationally simpler multi-domain classification based on averaging the multi-modal calibrated classifiers from the seen and unseen domains. Experiments on GZSL benchmarks show that our proposed GZSL approach achieves competitive results compared with the state-of-the-art.
Synthetic Data for the Analysis of Archival Documents: Handwriting Determination
Christian Bartz, Laurenz Seidel, Duy-Hung Nguyen, Joseph Bethge, Haojin Yang, C. Meinel
Pub Date: 2020-11-29 | DOI: 10.1109/DICTA51227.2020.9363410
Archives contain a wealth of information and are invaluable for historical research. Thanks to digitization, many archives are preserved in a digital format, making it easier to share and access documents from an archive. Handwriting and handwritten notes are commonly found in archives and contain a lot of information that cannot be extracted by analyzing documents with Optical Character Recognition (OCR) for printed text. In this paper, we present an approach for determining whether a scan of a document contains handwriting. As a preprocessing step, this approach can help to identify documents that need further analysis with a full recognition pipeline. Our method consists of a deep neural network that classifies whether a document contains handwriting. Our method is designed in such a way that we overcome the most significant challenge when working with archival data, which is the scarcity of annotated training data. To overcome this problem, we introduce a data generation method to successfully train our proposed deep neural network. Our experiments show that our model, trained on synthetic data, can achieve promising results on a real-world dataset from an art-historical archive.
Object Based Remote Sensing Using Sentinel Data
C. McLaughlin, A. Woodley, S. Geva, Timothy Chappell, W. Kelly, W. Boles, Lance De Vine, Holly Hutson
Pub Date: 2020-11-29 | DOI: 10.1109/DICTA51227.2020.9363427
Identifying changes on the Earth's surface is one of the most fundamental aspects of Earth observation from satellite images. Historically, the predominant form of analysis has measured change at the pixel level. Here, we present a new strategy that conducts the analysis based on objects. The objects are fed into a random forest regressor. We have tested our approach in Queensland, Australia, using Sentinel data. We find that the object-based approach either outperforms or is comparable to alternative approaches.