Subjective and Objective Quality Assessment of Rendered Human Avatar Videos in Virtual Reality

Yu-Chih Chen;Avinab Saha;Alexandre Chapiro;Christian Häne;Jean-Charles Bazin;Bo Qiu;Stefano Zanetti;Ioannis Katsavounidis;Alan C. Bovik
Journal: IEEE Transactions on Image Processing, vol. 33, pp. 5740-5754
Publication date: 2024-10-02
DOI: 10.1109/TIP.2024.3468881
IEEE Xplore: https://ieeexplore.ieee.org/document/10704572/

Abstract

We study the visual quality judgments of human subjects on digital human avatars (sometimes referred to as “holograms” in the parlance of virtual reality [VR] and augmented reality [AR] systems) that have been subjected to distortions. We also study the ability of video quality models to predict human judgments. As streaming of human avatar videos in VR and AR becomes increasingly common, more advanced human avatar video compression protocols will be required to address the tradeoff between faithfully transmitting high-quality visual representations and adjusting to changeable bandwidth scenarios. During transmission over the internet, the perceived quality of compressed human avatar videos can be severely impaired by visual artifacts. Video quality assessment (VQA) models are essential tools for optimizing the tradeoff between perceptual quality and data volume in practical workflows. However, very few VQA algorithms have been developed specifically to analyze human body avatar videos, due, at least in part, to the dearth of appropriate and comprehensive datasets of adequate size. Towards filling this gap, we introduce the LIVE-Meta Rendered Human Avatar VQA Database, which contains 720 human avatar videos processed using 20 different combinations of encoding parameters, labeled by corresponding human perceptual quality judgments that were collected in six-degrees-of-freedom VR headsets. To demonstrate the usefulness of this new and unique video resource, we use it to study and compare the performance of a variety of state-of-the-art Full Reference and No Reference video quality prediction models, including a new model called HoloQA. As a service to the research community, we publicly release the metadata of the new database at https://live.ece.utexas.edu/research/LIVE-Meta-rendered-human-avatar/index.html.