Liyana Sahir Kallooriyakath, J. V., B. V, Adith P P
2020 International Conference on Smart Technologies in Computing, Electrical and Electronics (ICSTCEE), published 2020-10-09. DOI: 10.1109/ICSTCEE49637.2020.9277374 (https://doi.org/10.1109/ICSTCEE49637.2020.9277374)
Visual Question Answering: Methodologies and Challenges
Given an image and a natural-language question about its contents, a Visual Question Answering system should produce a natural-language answer inferred from the information in the image. Visual question answering is a problem of increasing significance in Artificial Intelligence because it lies at the intersection of computer vision and natural language processing. Various methodologies have been proposed for obtaining a natural-language answer to a user-posed question about a given image. This paper reviews contemporary techniques for visual question answering, compares their advantages and limitations, and discusses areas for improvement within these approaches.