M. M. Alam, M. Rahman, M. Hosen, Khairul Anam Mubin, S. Hossen, M. F. Mridha
2022 International Conference on Decision Aid Sciences and Applications (DASA), published 2022-03-23. DOI: 10.1109/DASA54658.2022.9765268 (https://doi.org/10.1109/DASA54658.2022.9765268)
Bahdanau Attention Based Bengali Image Caption Generation
In the past few years, much work has been done on image-based object detection and on machine translation. Inspired by that work, we introduce Bahdanau Attention Based Bengali Image Caption Generation (BABBICG), which automatically generates Bangla captions for images. Bahdanau attention reduces the performance limitations of conventional encoder-decoder architectures and achieves significant improvements over them. In this work, we extract image features with the InceptionV3 neural network and generate captions with an RNN decoder, using a Gated Recurrent Unit (GRU) as the RNN. We evaluate the model on the BanglaLekhaImageCaptions dataset from Mendeley Data, which supports Bangla caption generation.
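
The attention step named in the abstract can be illustrated with a minimal, dependency-free sketch of additive (Bahdanau-style) attention. It is not the authors' implementation: the function name `bahdanau_attention`, the parameter names `W1`, `W2`, and `v`, the toy dimensions, and the random weights are all assumptions made for illustration. In a real model the region features would come from the image encoder (InceptionV3 here), the hidden state from the GRU decoder, and the parameters would be learned.

```python
import math
import random


def bahdanau_attention(features, hidden, W1, W2, v):
    """Additive attention over image regions (illustrative sketch).

    features: list of L region vectors, each of length D (encoder output)
    hidden:   decoder hidden state of length H (e.g. from a GRU)
    W1: D x A matrix, W2: H x A matrix, v: length-A vector (hypothetical names)
    Returns (context, weights).
    """
    A = len(v)
    # Project the hidden state once; it is shared by every region.
    h_proj = [sum(hidden[k] * W2[k][a] for k in range(len(hidden)))
              for a in range(A)]
    scores = []
    for f in features:
        f_proj = [sum(f[d] * W1[d][a] for d in range(len(f)))
                  for a in range(A)]
        # score = v^T tanh(W1^T f + W2^T h)
        scores.append(sum(v[a] * math.tanh(f_proj[a] + h_proj[a])
                          for a in range(A)))
    # Numerically stable softmax over regions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the region features.
    D = len(features[0])
    context = [sum(w * f[d] for w, f in zip(weights, features))
               for d in range(D)]
    return context, weights


def random_matrix(rows, cols):
    """Random stand-in for learned parameters or encoder features."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
L, D, H, A = 64, 16, 8, 10  # toy sizes, assumed for illustration only
features = random_matrix(L, D)
hidden = [random.gauss(0.0, 1.0) for _ in range(H)]
W1, W2 = random_matrix(D, A), random_matrix(H, A)
v = [random.gauss(0.0, 1.0) for _ in range(A)]

context, weights = bahdanau_attention(features, hidden, W1, W2, v)
```

At each decoding step the decoder would consume `context` together with the previous word embedding, so the caption can draw on different image regions for different words instead of relying on a single fixed image vector.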