An Efficient Image Captioning Method Based on Beam Search
Authors: Tarun Jaiswal, Manju Pandey, Priyanka Tripathi
Journal: Recent Advances in Electrical & Electronic Engineering (impact factor 0.6; JCR Q4, Engineering, Electrical & Electronic)
Published: 2023-10-18
DOI: https://doi.org/10.2174/0123520965254606231009091711
Citations: 0
Abstract
Introduction: Image captioning is an important task in computer vision and natural language processing. In recent years, deep neural networks have become an increasingly popular tool for generating descriptive captions for photos. However, these models frequently produce captions that are unoriginal and repetitive. Method: Beam search is a well-known search technique that can be used to generate image descriptions efficiently. The algorithm keeps track of a set of partial captions and expands them iteratively, choosing the most probable next words at each step until a complete caption is generated. The set of partial captions, known as the beam, is updated at each step based on the predicted probabilities of the next words. This paper presents an image caption generation system based on beam search. To encode the image data and generate captions, the system is trained on a deep neural network architecture that combines the benefits of a CNN and an RNN. Beam search is then run to produce the final captions. Results: Beam search yields a more diverse and descriptive set of captions than traditional greedy decoding. The experimental results indicate that the proposed system outperforms existing image caption generation techniques in the precision and variety of the generated captions. Conclusion: These results demonstrate the effectiveness of beam search in enhancing the efficiency of image caption generation systems.
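The beam update described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `beam_search`, the lookup table `TOY_MODEL`, and the `<s>`/`</s>` markers are hypothetical names introduced here, and the table stands in for the next-word probabilities that a trained CNN-RNN decoder would predict for a given image.

```python
import math
from typing import Callable, Dict, List, Tuple

Caption = Tuple[str, ...]

# Hypothetical next-word probabilities keyed by the previous word.
# A real captioner would condition on the image and the full prefix.
TOY_MODEL: Dict[str, Dict[str, float]] = {
    "<s>": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.5, "cat": 0.5},
    "the": {"dog": 0.9, "cat": 0.1},
    "dog": {"</s>": 1.0},
    "cat": {"</s>": 1.0},
}


def toy_next_word_probs(prefix: Caption) -> Dict[str, float]:
    """Return the next-word distribution for a partial caption."""
    return TOY_MODEL.get(prefix[-1], {})


def beam_search(
    next_word_probs: Callable[[Caption], Dict[str, float]],
    beam_width: int = 3,
    max_len: int = 10,
    start: str = "<s>",
    end: str = "</s>",
) -> List[Tuple[Caption, float]]:
    """Return captions with their log-probabilities, best first."""
    beam: List[Tuple[Caption, float]] = [((start,), 0.0)]
    completed: List[Tuple[Caption, float]] = []

    for _ in range(max_len):
        # Expand every partial caption in the beam by one word.
        candidates: List[Tuple[Caption, float]] = []
        for words, score in beam:
            for word, prob in next_word_probs(words).items():
                if prob > 0.0:
                    candidates.append((words + (word,), score + math.log(prob)))
        if not candidates:
            break

        # Keep only the beam_width highest-scoring expansions.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beam = []
        for words, score in candidates[:beam_width]:
            if words[-1] == end:
                completed.append((words, score))
            else:
                beam.append((words, score))
        if not beam:
            break

    # Captions still unfinished at max_len are returned as they are.
    completed.extend(beam)
    return sorted(completed, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    greedy = beam_search(toy_next_word_probs, beam_width=1)[0]
    best = beam_search(toy_next_word_probs, beam_width=2)[0]
    print("greedy:", " ".join(greedy[0]), round(math.exp(greedy[1]), 2))
    print("beam=2:", " ".join(best[0]), round(math.exp(best[1]), 2))
```

With `beam_width=1` the procedure reduces to greedy decoding, which commits to the locally best first word. In this toy table a wider beam keeps a second first word alive and ends with a caption of higher overall probability, which is the behavior the abstract contrasts with greedy decoding.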
About the journal:
Recent Advances in Electrical & Electronic Engineering publishes full-length and mini reviews, research articles, and guest-edited thematic issues on electrical and electronic engineering and its applications. The journal also covers research on fast-emerging applications of electrical power supply, electrical systems, power transmission, electromagnetism, motor control processes, and related technologies in electrical and electronic engineering. The journal is essential reading for all researchers in electrical and electronic engineering science.