A survey of generative adversarial networks and their application in text-to-image synthesis

Electronic Research Archive | Pub Date: 2023-01-01 | DOI: 10.3934/era.2023362 | IF 1.0 | Region 4 (Mathematics) | JCR Q1 (Mathematics)
Wu Zeng, Heng-liang Zhu, Chuan Lin, Zheng-ying Xiao
Citations: 0

Abstract

Driven by continued advances in science and technology, especially computing devices with powerful processing capabilities, deep-learning-based image generation has made significant progress. Cross-modal generation of images from text using deep learning has become an active research topic. Text-to-image (T2I) synthesis has applications in several areas of computer vision, such as image enhancement, artificial intelligence painting, games and virtual reality. T2I methods built on generative adversarial networks (GANs) can generate realistic and diverse images, but shortcomings and challenges remain, such as difficulty in generating complex backgrounds. This review is organized as follows. First, we introduce the basic principles and architectures of the basic and classic GANs. Second, we group T2I synthesis methods into four main categories: methods based on semantic enhancement, methods based on progressive structure, methods based on attention and methods that introduce additional signals. We present a selection of classic and recent T2I methods and explain their main advantages and shortcomings. Third, we describe the basic datasets and evaluation metrics used in the T2I field. Finally, we discuss prospects for future research directions. This review provides a systematic introduction to basic GAN methods and the T2I methods built on them, and can serve as a reference for researchers.
