Infrared Image Captioning Based on Unsupervised Learning and Reinforcement Learning
Chenjun Gao, Ganghui Bian, Yanzhi Dong, Xiaohu Yuan, Huaping Liu
2022 International Conference on Automation, Robotics and Computer Engineering (ICARCE)
DOI: 10.1109/ICARCE55724.2022.10046598 (https://doi.org/10.1109/ICARCE55724.2022.10046598)
Published: 2022-12-16
Citations: 1
Abstract
When prior knowledge is insufficient or manual annotation is difficult, learning directly from unlabeled training samples can greatly reduce time cost. We therefore add unsupervised learning to the preparatory stage of image captioning, using it for efficient image domain conversion so that the required images can be generated in batches. Meanwhile, infrared images are increasingly used to assist decision making and environment perception, and generating more diverse and discriminative captions for similar scenes can strengthen both capabilities. Our infrared image captioning model, trained with reinforcement learning, achieves satisfactory results both in quantitative scores and in real-scene tests.