Short-term memory with read-only unit in neural image caption generator
Aghasi Poghosyan, H. Sarukhanyan
2017 Computer Science and Information Technologies (CSIT), 2017-09-01
DOI: https://doi.org/10.1109/CSITECHNOL.2017.8312163
Citations: 13
Abstract
Automated caption generation for digital images is one of the fundamental problems in artificial intelligence. Most existing work uses Long Short-Term Memory (LSTM) as the recurrent neural network cell for this task. After training, such deep neural models can generate an image caption. However, there is an issue: the next predicted word of the caption depends mainly on the previously predicted word rather than on the image content. In this paper we present a model that automatically generates an image description and is based on a recurrent neural network with a modified LSTM cell that has an additional gate responsible for image features. This modification results in more accurate captions. We trained and tested our model on the MSCOCO image dataset using only images and their captions.
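
The abstract does not give the equations of the modified cell, so the following is only an illustrative sketch of what an LSTM step with one extra gate for image features could look like. It is not the authors' formulation. The function names, the parameter layout, and the update rule `c = f * c_prev + i * g + r * m` are all assumptions made for this example: `r` is the hypothetical image gate, and `m` is a projection of an image feature vector that is read at every step but never written by the cell.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(input_size, hidden_size, image_size, seed=0):
    """Random parameters for the sketch (hypothetical layout, not from the paper)."""
    rng = np.random.default_rng(seed)
    concat = input_size + hidden_size
    return {
        # Five blocks stacked row-wise: input, forget, output, candidate,
        # and the assumed extra image gate.
        "W": rng.normal(0.0, 0.1, size=(5 * hidden_size, concat)),
        "b": np.zeros(5 * hidden_size),
        # Projection of the image feature vector into the cell space.
        "W_img": rng.normal(0.0, 0.1, size=(hidden_size, image_size)),
    }


def image_gated_lstm_step(params, x, h_prev, c_prev, image_feat):
    """One step of an LSTM cell with an assumed additional image gate."""
    H = h_prev.shape[0]
    z = params["W"] @ np.concatenate([x, h_prev]) + params["b"]

    i = sigmoid(z[0:H])            # input gate
    f = sigmoid(z[H:2 * H])        # forget gate
    o = sigmoid(z[2 * H:3 * H])    # output gate
    g = np.tanh(z[3 * H:4 * H])    # candidate cell update
    r = sigmoid(z[4 * H:5 * H])    # assumed image gate

    # The image features are only read here; the cell never modifies them.
    m = np.tanh(params["W_img"] @ image_feat)

    # Standard LSTM update plus an image term scaled by the assumed gate.
    c = f * c_prev + i * g + r * m
    h = o * np.tanh(c)
    return h, c


if __name__ == "__main__":
    params = init_params(input_size=4, hidden_size=3, image_size=6)
    h, c = np.zeros(3), np.zeros(3)
    image_feat = np.ones(6)  # stand-in for CNN features of one image
    for _ in range(5):       # five dummy word embeddings
        h, c = image_gated_lstm_step(params, np.ones(4), h, c, image_feat)
    print(h.shape, c.shape)
```

In this sketch the image term enters the cell state at every time step, so the hidden state cannot depend on the previous word alone. With an all-zero `image_feat` the extra term vanishes and the step reduces to a standard LSTM update.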