{"title":"PBNet: Position-specific Text-to-image Generation by Boundary","authors":"Tian Tian, Li Liu, Huaxiang Zhang, Dongmei Liu","doi":"10.1145/3469877.3493594","DOIUrl":null,"url":null,"abstract":"Most existing methods focus on improving the clarity and semantic consistency of the image with a given text, but do not pay attention to the multiple control of generated image content, such as the position of the object in generated image. In this paper, we introduce a novel position-based generative network (PBNet) which can generate fine-grained images with the object at the specified location. PBNet combines iterative structure with generative adversarial network (GAN). A location information embedding module (LIEM) is proposed to combine the location information extracted from the boundary block image with the semantic information extracted from the text. In addition, a silhouette generation module (SGM) is proposed to train the generator to generate object based on location information. The experimental results on CUB dataset demonstrate that PBNet effectively controls the location of the object in the generated image.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Multimedia Asia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3469877.3493594","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Most existing methods focus on improving the clarity of generated images and their semantic consistency with a given text, but pay little attention to finer control over the generated content, such as the position of the object in the generated image. In this paper, we introduce a novel position-based generative network (PBNet) that generates fine-grained images with the object at a specified location. PBNet combines an iterative structure with a generative adversarial network (GAN). A location information embedding module (LIEM) is proposed to combine the location information extracted from a boundary block image with the semantic information extracted from the text. In addition, a silhouette generation module (SGM) is proposed to train the generator to generate objects based on the location information. Experimental results on the CUB dataset demonstrate that PBNet effectively controls the location of the object in the generated image.
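The abstract gives no implementation details, but the core LIEM idea, fusing location features from a boundary image with a text embedding to condition a GAN generator, can be illustrated concretely. The following is a minimal PyTorch sketch under stated assumptions: the module name, all dimensions, the small CNN encoder, and fusion by concatenation plus a linear projection are illustrative choices, not the authors' design.

```python
import torch
import torch.nn as nn

class LIEMSketch(nn.Module):
    """Hypothetical location-information embedding module (LIEM).

    Fuses location features extracted from a 1-channel boundary block
    image with a sentence embedding from a text encoder, producing a
    joint conditioning code for a GAN generator. The encoder layout
    and concatenation-based fusion are assumptions for illustration.
    """

    def __init__(self, text_dim=256, loc_dim=128, out_dim=256):
        super().__init__()
        # Small CNN to extract location features from the boundary
        # image (the object's silhouette / bounding region).
        self.loc_encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(64, loc_dim),
        )
        # Project the concatenated [text; location] vector into a
        # single conditioning code.
        self.fuse = nn.Linear(text_dim + loc_dim, out_dim)

    def forward(self, text_emb, boundary_img):
        loc_emb = self.loc_encoder(boundary_img)
        return self.fuse(torch.cat([text_emb, loc_emb], dim=1))

# Usage: a batch of 4 sentence embeddings and 64x64 boundary images.
liem = LIEMSketch()
text_emb = torch.randn(4, 256)
boundary_img = torch.randn(4, 1, 64, 64)
cond = liem(text_emb, boundary_img)  # (4, 256) conditioning code
```

In such a design, the conditioning code would be fed to the generator at each step of the iterative structure, so that both the text semantics and the target position constrain what is synthesized where.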