Highly realistic synthetic dataset for pixel-level DensePose estimation via diffusion model

IF 7.5 1区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pattern Recognition Pub Date : 2024-11-06 DOI:10.1016/j.patcog.2024.111137
Jiaxiao Wen, Tao Chu, Qiong Liu
{"title":"Highly realistic synthetic dataset for pixel-level DensePose estimation via diffusion model","authors":"Jiaxiao Wen,&nbsp;Tao Chu,&nbsp;Qiong Liu","doi":"10.1016/j.patcog.2024.111137","DOIUrl":null,"url":null,"abstract":"<div><div>Generating training data with pixel-level annotations for DensePose is a labor-intensive task, resulting in sparse labeling in real-world datasets. Prior solutions have relied on specialized data generation systems to synthesize datasets. However, these synthetic datasets often lack realism and rely on expensive resources such as human body models and texture mappings. In this paper, we address these challenges by introducing a novel data generation method based on the diffusion model, effectively producing highly realistic data without the need for expensive resources. Specifically, our method comprises annotation generation and image generation. Utilizing graphic renderers and SMPL models, we produce synthetic annotations solely based on human poses and shapes. Subsequently, guided by these annotations, we employ simple yet effective textual prompts to generate a wide range of realistic images using the diffusion model. Our experiments conducted on DensePose-COCO dataset demonstrate the superiority of our method compared to existing methods. Code and benchmarks will be released.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"159 ","pages":"Article 111137"},"PeriodicalIF":7.5000,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Pattern Recognition","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0031320324008884","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Generating training data with pixel-level annotations for DensePose is a labor-intensive task, resulting in sparse labeling in real-world datasets. Prior solutions have relied on specialized data generation systems to synthesize datasets. However, these synthetic datasets often lack realism and rely on expensive resources such as human body models and texture mappings. In this paper, we address these challenges by introducing a novel data generation method based on the diffusion model, effectively producing highly realistic data without the need for expensive resources. Specifically, our method comprises annotation generation and image generation. Utilizing graphic renderers and SMPL models, we produce synthetic annotations solely based on human poses and shapes. Subsequently, guided by these annotations, we employ simple yet effective textual prompts to generate a wide range of realistic images using the diffusion model. Our experiments conducted on DensePose-COCO dataset demonstrate the superiority of our method compared to existing methods. Code and benchmarks will be released.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
通过扩散模型估算像素级 DensePose 的高逼真度合成数据集
为 DensePose 生成带有像素级注释的训练数据是一项劳动密集型任务,导致真实世界数据集中的标签稀疏。先前的解决方案依赖于专门的数据生成系统来合成数据集。然而,这些合成数据集往往缺乏真实感,而且依赖于昂贵的资源,如人体模型和纹理映射。在本文中,我们引入了一种基于扩散模型的新型数据生成方法,无需昂贵的资源即可有效生成高度逼真的数据,从而解决了这些难题。具体来说,我们的方法包括注释生成和图像生成。利用图形渲染器和 SMPL 模型,我们仅根据人的姿势和形状生成合成注释。随后,在这些注释的指导下,我们采用简单而有效的文字提示,利用扩散模型生成各种逼真的图像。我们在 DensePose-COCO 数据集上进行的实验证明,与现有方法相比,我们的方法更胜一筹。我们将发布代码和基准。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Pattern Recognition
Pattern Recognition 工程技术-工程:电子与电气
CiteScore
14.40
自引率
16.20%
发文量
683
审稿时长
5.6 months
期刊介绍: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.
期刊最新文献
Learning accurate and enriched features for stereo image super-resolution Semi-supervised multi-view feature selection with adaptive similarity fusion and learning DyConfidMatch: Dynamic thresholding and re-sampling for 3D semi-supervised learning CAST: An innovative framework for Cross-dimensional Attention Structure in Transformers Embedded feature selection for robust probability learning machines
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1