高分辨率遥感图像的深度语义理解

2016 International Conference on Computer, Information and Telecommunication Systems (CITS) Pub Date : 2016-07-06 DOI:10.1109/CITS.2016.7546397

Bo Qu, Xuelong Li, D. Tao, Xiaoqiang Lu

{"title":"高分辨率遥感图像的深度语义理解","authors":"Bo Qu, Xuelong Li, D. Tao, Xiaoqiang Lu","doi":"10.1109/CITS.2016.7546397","DOIUrl":null,"url":null,"abstract":"With the rapid development of remote sensing technology, huge quantities of high resolution remote sensing images are available now. Understanding these images in semantic level is of great significance. Hence, a deep multimodal neural network model for semantic understanding of the high resolution remote sensing images is proposed in this paper, which uses both visual and textual information of the high resolution remote sensing images to generate natural sentences describing the given images. In the proposed model, the convolution neural network is utilized to extract the image feature, which is then combined with the text descriptions of the images by RNN or LSTMs. And in the experiments, two new remote sensing image-captions datasets are built at first. Then different kinds of CNNs with RNN or LSTMs are combined to find which is the best combination for caption generation. The experiments results prove that the proposed method achieves good performances in semantic understanding of high resolution remote sensing images.","PeriodicalId":340958,"journal":{"name":"2016 International Conference on Computer, Information and Telecommunication Systems (CITS)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"119","resultStr":"{\"title\":\"Deep semantic understanding of high resolution remote sensing image\",\"authors\":\"Bo Qu, Xuelong Li, D. Tao, Xiaoqiang Lu\",\"doi\":\"10.1109/CITS.2016.7546397\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the rapid development of remote sensing technology, huge quantities of high resolution remote sensing images are available now. Understanding these images in semantic level is of great significance. Hence, a deep multimodal neural network model for semantic understanding of the high resolution remote sensing images is proposed in this paper, which uses both visual and textual information of the high resolution remote sensing images to generate natural sentences describing the given images. In the proposed model, the convolution neural network is utilized to extract the image feature, which is then combined with the text descriptions of the images by RNN or LSTMs. And in the experiments, two new remote sensing image-captions datasets are built at first. Then different kinds of CNNs with RNN or LSTMs are combined to find which is the best combination for caption generation. The experiments results prove that the proposed method achieves good performances in semantic understanding of high resolution remote sensing images.\",\"PeriodicalId\":340958,\"journal\":{\"name\":\"2016 International Conference on Computer, Information and Telecommunication Systems (CITS)\",\"volume\":\"28 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-07-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"119\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 International Conference on Computer, Information and Telecommunication Systems (CITS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CITS.2016.7546397\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 International Conference on Computer, Information and Telecommunication Systems (CITS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CITS.2016.7546397","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 119

摘要

随着遥感技术的飞速发展，现在有大量的高分辨率遥感图像可供使用。从语义层面理解这些图像具有重要意义。为此，本文提出了一种用于高分辨率遥感图像语义理解的深度多模态神经网络模型，该模型利用高分辨率遥感图像的视觉和文本信息生成描述给定图像的自然句子。在该模型中，利用卷积神经网络提取图像特征，然后通过RNN或lstm将其与图像的文本描述相结合。在实验中，首先建立了两个新的遥感图像字幕数据集。然后将不同类型的cnn与RNN或lstm相结合，找出哪种组合最适合标题生成。实验结果表明，该方法在高分辨率遥感图像的语义理解方面取得了较好的效果。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Deep semantic understanding of high resolution remote sensing image

With the rapid development of remote sensing technology, huge quantities of high resolution remote sensing images are available now. Understanding these images in semantic level is of great significance. Hence, a deep multimodal neural network model for semantic understanding of the high resolution remote sensing images is proposed in this paper, which uses both visual and textual information of the high resolution remote sensing images to generate natural sentences describing the given images. In the proposed model, the convolution neural network is utilized to extract the image feature, which is then combined with the text descriptions of the images by RNN or LSTMs. And in the experiments, two new remote sensing image-captions datasets are built at first. Then different kinds of CNNs with RNN or LSTMs are combined to find which is the best combination for caption generation. The experiments results prove that the proposed method achieves good performances in semantic understanding of high resolution remote sensing images.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2016 International Conference on Computer, Information and Telecommunication Systems (CITS)

自引率

0.00%

发文量