{"title":"高分辨率遥感图像的深度语义理解","authors":"Bo Qu, Xuelong Li, D. Tao, Xiaoqiang Lu","doi":"10.1109/CITS.2016.7546397","DOIUrl":null,"url":null,"abstract":"With the rapid development of remote sensing technology, huge quantities of high resolution remote sensing images are available now. Understanding these images in semantic level is of great significance. Hence, a deep multimodal neural network model for semantic understanding of the high resolution remote sensing images is proposed in this paper, which uses both visual and textual information of the high resolution remote sensing images to generate natural sentences describing the given images. In the proposed model, the convolution neural network is utilized to extract the image feature, which is then combined with the text descriptions of the images by RNN or LSTMs. And in the experiments, two new remote sensing image-captions datasets are built at first. Then different kinds of CNNs with RNN or LSTMs are combined to find which is the best combination for caption generation. The experiments results prove that the proposed method achieves good performances in semantic understanding of high resolution remote sensing images.","PeriodicalId":340958,"journal":{"name":"2016 International Conference on Computer, Information and Telecommunication Systems (CITS)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"119","resultStr":"{\"title\":\"Deep semantic understanding of high resolution remote sensing image\",\"authors\":\"Bo Qu, Xuelong Li, D. Tao, Xiaoqiang Lu\",\"doi\":\"10.1109/CITS.2016.7546397\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the rapid development of remote sensing technology, huge quantities of high resolution remote sensing images are available now. Understanding these images in semantic level is of great significance. Hence, a deep multimodal neural network model for semantic understanding of the high resolution remote sensing images is proposed in this paper, which uses both visual and textual information of the high resolution remote sensing images to generate natural sentences describing the given images. In the proposed model, the convolution neural network is utilized to extract the image feature, which is then combined with the text descriptions of the images by RNN or LSTMs. And in the experiments, two new remote sensing image-captions datasets are built at first. Then different kinds of CNNs with RNN or LSTMs are combined to find which is the best combination for caption generation. The experiments results prove that the proposed method achieves good performances in semantic understanding of high resolution remote sensing images.\",\"PeriodicalId\":340958,\"journal\":{\"name\":\"2016 International Conference on Computer, Information and Telecommunication Systems (CITS)\",\"volume\":\"28 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-07-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"119\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 International Conference on Computer, Information and Telecommunication Systems (CITS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CITS.2016.7546397\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 International Conference on Computer, Information and Telecommunication Systems (CITS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CITS.2016.7546397","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Deep semantic understanding of high resolution remote sensing image
With the rapid development of remote sensing technology, huge quantities of high resolution remote sensing images are available now. Understanding these images in semantic level is of great significance. Hence, a deep multimodal neural network model for semantic understanding of the high resolution remote sensing images is proposed in this paper, which uses both visual and textual information of the high resolution remote sensing images to generate natural sentences describing the given images. In the proposed model, the convolution neural network is utilized to extract the image feature, which is then combined with the text descriptions of the images by RNN or LSTMs. And in the experiments, two new remote sensing image-captions datasets are built at first. Then different kinds of CNNs with RNN or LSTMs are combined to find which is the best combination for caption generation. The experiments results prove that the proposed method achieves good performances in semantic understanding of high resolution remote sensing images.