Generating Descriptions of Spatial Relations between Objects in Images
A. Muscat, A. Belz
European Workshop on Natural Language Generation, 2015-09-01
DOI: 10.18653/v1/W15-4717 (https://doi.org/10.18653/v1/W15-4717)
We investigate the task of predicting prepositions that can be used to describe the spatial relationships between pairs of objects depicted in images. We explore the extent to which such spatial prepositions can be predicted from (a) language information, (b) visual information, and (c) combinations of the two. In this paper we describe the dataset of object pairs and prepositions we have created, and report first results for predicting prepositions for object pairs, using a Naive Bayes framework. The features we use include object class labels and geometrical features computed from object bounding boxes. We evaluate the results in terms of accuracy against human-selected prepositions.
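To make the setup concrete, the following is a minimal sketch of a Naive Bayes preposition predictor that combines object class labels with geometric features derived from bounding boxes. It is an illustration only, not the authors' implementation: the box format `(x_min, y_min, x_max, y_max)`, the two geometric features (`overlap`, `vertical`), their binning thresholds, the function and class names, and the toy training pairs are all assumptions introduced here, and the toy data is invented rather than taken from the paper's dataset.

```python
import math
from collections import Counter, defaultdict


def extract_features(label_a, box_a, label_b, box_b):
    """Build categorical features for an object pair (hypothetical feature set).

    Boxes are assumed to be (x_min, y_min, x_max, y_max) with y growing downward.
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b

    # Fraction of object A's box covered by object B's box, binned.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    area_a = (ax1 - ax0) * (ay1 - ay0)
    ratio = (inter_w * inter_h) / area_a if area_a > 0 else 0.0
    if ratio == 0:
        overlap = "none"
    elif ratio < 0.5:
        overlap = "partial"
    else:
        overlap = "high"

    # Relative vertical position of the two box centres, with a tolerance band.
    cy_a, cy_b = (ay0 + ay1) / 2, (by0 + by1) / 2
    tol = 0.1 * max(ay1 - ay0, by1 - by0)
    if cy_a < cy_b - tol:
        vertical = "a_above"
    elif cy_a > cy_b + tol:
        vertical = "a_below"
    else:
        vertical = "level"

    return {"label_a": label_a, "label_b": label_b,
            "overlap": overlap, "vertical": vertical}


class NaiveBayes:
    """Categorical Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)  # (class, feature) -> value counts
        self.feature_values = defaultdict(set)      # feature -> values seen in training

    def fit(self, examples):
        for feats, preposition in examples:
            self.class_counts[preposition] += 1
            for name, value in feats.items():
                self.feature_counts[(preposition, name)][value] += 1
                self.feature_values[name].add(value)
        return self

    def predict(self, feats):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for preposition, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            for name, value in feats.items():
                n_values = len(self.feature_values[name] | {value})
                num = self.feature_counts[(preposition, name)][value] + self.alpha
                score += math.log(num / (count + self.alpha * n_values))
            if score > best_score:
                best, best_score = preposition, score
        return best


# Invented toy pairs: (features, human-selected preposition).
TOY_DATA = [
    (extract_features("person", (40, 10, 80, 70), "horse", (20, 50, 120, 120)), "on"),
    (extract_features("dog", (30, 40, 70, 80), "sofa", (10, 60, 150, 120)), "on"),
    (extract_features("cat", (50, 90, 90, 120), "table", (20, 40, 140, 100)), "under"),
    (extract_features("person", (10, 20, 50, 120), "car", (60, 40, 160, 110)), "next to"),
    (extract_features("dog", (100, 60, 140, 100), "person", (20, 30, 60, 130)), "next to"),
]

model = NaiveBayes().fit(TOY_DATA)
query = extract_features("cat", (40, 20, 80, 60), "sofa", (10, 50, 150, 110))
print(model.predict(query))
```

In this sketch, `predict` scores only the features it is given, so passing just the `label_a`/`label_b` entries or just the `overlap`/`vertical` entries gives a language-only or vision-only variant of the same model. Accuracy against human-selected prepositions would then be the fraction of held-out pairs for which the predicted preposition matches the annotated one.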