
Latest publications: Proceedings of the ACM on computer graphics and interactive techniques

Effect of Render Resolution on Gameplay Experience, Performance, and Simulator Sickness in Virtual Reality Games
Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2022-03-23 · DOI: 10.1145/3522610
Jialin Wang, Rongkai Shi, Zehui Xiao, Xueying Qin, Hai-Ning Liang
Higher resolution is one of the main directions and drivers in the development of virtual reality (VR) head-mounted displays (HMDs). However, given its associated higher cost, it is important to determine the benefits of higher resolution for user experience. For non-VR games, higher resolution is often thought to lead to a better experience, but this effect remains unexplored in VR games. This research investigates the resolution tradeoff in gameplay experience, performance, and simulator sickness (SS) for VR games, particularly first-person shooter (FPS) games. To this end, we designed an experiment to collect gameplay experience, SS, and player performance data with a popular VR FPS game, Half-Life: Alyx. Our results indicate that 2K resolution is an important threshold for an enhanced gameplay experience without degrading performance or increasing SS levels. Moreover, resolutions from 1K to 4K show no significant difference in player performance. Our results can inform game developers and players in choosing an HMD that balances the tradeoff between costs and benefits and achieves an optimal experience.
Citations: 11
Bringing Linearly Transformed Cosines to Anisotropic GGX
Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2022-03-22 · DOI: 10.1145/3522612
T. AakashK., E. Heitz, J. Dupuy, P J Narayanan
Linearly Transformed Cosines (LTCs) are a family of distributions that are used for real-time area-light shading thanks to their analytic integration properties. Modern game engines use an LTC approximation of the ubiquitous GGX model, but currently this approximation only exists for isotropic GGX and thus anisotropic GGX is not supported. While the higher dimensionality presents a challenge in itself, we show that several additional problems arise when fitting, post-processing, storing, and interpolating LTCs in the anisotropic case. Each of these operations must be done carefully to avoid rendering artifacts. We find robust solutions for each operation by introducing and exploiting invariance properties of LTCs. As a result, we obtain a small 8⁴ look-up table that provides a plausible and artifact-free LTC approximation to anisotropic GGX and brings it to real-time area-light shading.
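The core operation behind LTCs is cheap to state: a clamped-cosine lobe is warped by a 3x3 linear transform M, and the transformed density follows from a change of variables. The sketch below is a minimal, self-contained C++ illustration of evaluating one LTC lobe, not the paper's code; it assumes the inverse matrix Minv has already been fetched and interpolated from the look-up table.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Evaluate an LTC density: D(w) = D_o(Minv*w / |Minv*w|) * |det(Minv)| / |Minv*w|^3,
// where D_o is the clamped cosine max(z, 0) / pi. Minv is row-major.
float evalLTC(const float Minv[3][3], Vec3 w) {
    // Transform the query direction back into the cosine-lobe's space.
    Vec3 wo = {
        Minv[0][0]*w.x + Minv[0][1]*w.y + Minv[0][2]*w.z,
        Minv[1][0]*w.x + Minv[1][1]*w.y + Minv[1][2]*w.z,
        Minv[2][0]*w.x + Minv[2][1]*w.y + Minv[2][2]*w.z };
    float len = std::sqrt(wo.x*wo.x + wo.y*wo.y + wo.z*wo.z);

    float detMinv =
        Minv[0][0]*(Minv[1][1]*Minv[2][2] - Minv[1][2]*Minv[2][1])
      - Minv[0][1]*(Minv[1][0]*Minv[2][2] - Minv[1][2]*Minv[2][0])
      + Minv[0][2]*(Minv[1][0]*Minv[2][1] - Minv[1][1]*Minv[2][0]);

    float cosTheta = std::fmax(wo.z / len, 0.0f);       // clamped-cosine lobe
    float Do = cosTheta / 3.14159265f;                   // normalized cosine density
    return Do * std::fabs(detMinv) / (len * len * len);  // change-of-variables Jacobian
}
```

Because the transform is linear, integrating this density over a polygonal light reduces to the known analytic integral of the clamped cosine over the back-transformed polygon, which is what makes LTCs attractive for real-time area lights.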
Citations: 5
Rendering Layered Materials with Diffuse Interfaces
Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2022-03-22 · DOI: 10.1145/3522620
Heloise de Dinechin, Laurent Belcour
In this work, we introduce a novel method to render, in real-time, Lambertian surfaces with a rough dielectric coating. We show that the appearance of such configurations is faithfully represented with two microfacet lobes accounting for direct and indirect interactions respectively. We numerically fit these lobes based on the first-order directional statistics (energy, mean, and variance) of light transport using 5D tables and narrow them down to 2D + 1D with analytical forms and dimension reduction. We demonstrate the quality of our method by efficiently rendering rough plastics and ceramics, closely matching ground truth. In addition, we improve a state-of-the-art layered material model to include Lambertian interfaces.
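As a rough illustration of the two-lobe structure described above, the sketch below composes a direct (coating) and an indirect (base) contribution, each weighted by a fitted energy. All names and the lobe evaluations are placeholders of ours; in the paper the energies, means, and variances come from the tabulated first-order statistics.

```cpp
// A fitted lobe, reduced to the quantities the abstract mentions.
struct FittedLobe {
    float energy;  // fraction of incident light carried by this lobe
    float rough;   // lobe spread, derived from the fitted variance (unused here)
};

// fDirect / fIndirect stand in for evaluations of the two normalized microfacet
// lobes at the current view/light directions.
float shadeLayered(const FittedLobe& direct, const FittedLobe& indirect,
                   float fDirect, float fIndirect) {
    // Energy-weighted sum of the direct (coating) and indirect (base) responses.
    return direct.energy * fDirect + indirect.energy * fIndirect;
}
```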
Citations: 2
Real-Time Style Modelling of Human Locomotion via Feature-Wise Transformations and Local Motion Phases
Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2022-01-12 · DOI: 10.1145/3522618
I. Mason, S. Starke, T. Komura
Controlling the manner in which a character moves in a real-time animation system is a challenging task with useful applications. Existing style transfer systems require access to a reference content motion clip; in real-time systems, however, the future motion content is unknown and liable to change with user input. In this work we present a style modelling system that uses an animation synthesis network to model motion content based on local motion phases. An additional style modulation network uses feature-wise transformations to modulate style in real-time. To evaluate our method, we create and release a new style modelling dataset, 100STYLE, containing over 4 million frames of stylised locomotion data in 100 different styles that present a number of challenges for existing systems. To model these styles, we extend the local phase calculation with a contact-free formulation. In comparison to other methods for real-time style modelling, we show our system is more robust and efficient in its style representation while improving motion quality.
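The "feature-wise transformations" mentioned above are, in spirit, FiLM-style modulations: each feature channel is scaled and shifted by parameters predicted from a style embedding. A minimal sketch follows (our illustration, not the paper's network; gamma and beta are assumed to be the per-channel outputs of the style modulation network).

```cpp
#include <cstddef>
#include <vector>

// Feature-wise modulation: y[i] = gamma[i] * x[i] + beta[i].
// x is an activation vector from the animation synthesis network; gamma/beta
// are assumed to be predicted per channel from a style embedding.
std::vector<float> modulateStyle(const std::vector<float>& x,
                                 const std::vector<float>& gamma,
                                 const std::vector<float>& beta) {
    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        y[i] = gamma[i] * x[i] + beta[i];  // per-channel scale and shift
    return y;
}
```

Because the modulation is a per-channel affine map, the same synthesis network can be steered toward different styles at runtime by swapping only gamma and beta, which is what makes this kind of control viable in real time.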
Citations: 17
FaceType: Crafting Written Impressions of Spoken Expression
Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2022-01-01 · DOI: 10.1145/3533385
Kevin Maher, Fan Xiang, Liang Zhi
FaceType is an interactive installation that creates an experience of spoken communication through generated text. Inspired by Chinese calligraphy, the project transforms our spoken expression into handwriting. FaceType explores which parts of our spoken expression can be evoked in writing, and what the most natural form of interaction between the two might be. The work aims to let lay audiences experience emotion, emphasis, and critical information in speech. Audience reflection about patterns in their expression and the role of unconscious and conscious expression provides new directions for further works.
Citations: 0
DCGrid: An Adaptive Grid Structure for Memory-Constrained Fluid Simulation on the GPU
Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2022-01-01 · DOI: 10.1145/3522608
Wouter Raateland, Torsten Hädrich, Jorge Alejandro Amador Herrera, D. Banuti, Wojciech Palubicki, S. Pirk, K. Hildebrandt, D. Michels
We introduce Dynamic Constrained Grid (DCGrid), a hierarchical and adaptive grid structure for fluid simulation combined with a scheme for effectively managing the grid adaptations. DCGrid is designed to be implemented on the GPU and used in high-performance simulations. Specifically, it allows us to efficiently vary and adjust the grid resolution across the spatial domain and to rapidly evaluate local stencils and individual cells in a GPU implementation. A special feature of DCGrid is that the control of the grid adaptation is modeled as an optimization under a constraint on the maximum available memory, which addresses the memory limitations of GPU-based simulation. To further advance the use of DCGrid in high-performance simulations, we complement DCGrid with an efficient scheme for approximating collisions between fluids and static solids on cells with different resolutions. We demonstrate the effectiveness of DCGrid for smoke flows and complex cloud simulations in which terrain-atmosphere interaction requires working with cells of varying resolution and rapidly changing conditions. Finally, we compare the performance of DCGrid to that of alternative adaptive grid structures.
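To make the memory-constrained adaptation concrete, here is a hedged CPU sketch of the underlying idea: refine cells greedily by estimated benefit until a hard cell budget is exhausted. The names and the greedy policy are our simplification; the actual system adapts a hierarchical grid on the GPU and also coarsens low-benefit regions.

```cpp
#include <queue>
#include <vector>

// A cell proposed for refinement, scored by how much finer resolution would help
// (e.g., local flow detail). The scoring function itself is an assumption here.
struct CellCandidate {
    int   cellId;
    float benefit;
    bool operator<(const CellCandidate& o) const { return benefit < o.benefit; }
};

// Refine the highest-benefit cells while the total cell count stays under the
// memory-derived budget. Each refinement replaces one parent with its children.
void adaptGrid(std::priority_queue<CellCandidate>& candidates,
               int& cellCount, int maxCells, int childrenPerCell = 8) {
    while (!candidates.empty() &&
           cellCount + childrenPerCell - 1 <= maxCells) {
        CellCandidate c = candidates.top();
        candidates.pop();
        cellCount += childrenPerCell - 1;
        // A full implementation would push the new children with their own
        // benefit estimates and also consider coarsening elsewhere.
        (void)c;
    }
}
```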
Citations: 3
PLOC++: Parallel Locally-Ordered Clustering for Bounding Volume Hierarchy Construction Revisited
Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2022-01-01 · DOI: 10.1145/3543867
Carsten Benthin, R. Drabinski, Lorenzo Tessari, Addis Dittebrandt
We propose a novel version of the GPU-oriented massively parallel locally-ordered clustering (PLOC) algorithm for constructing bounding volume hierarchies (BVHs). Our method removes the weaknesses of the original approach by simplifying and fusing different phases, while replacing most performance-critical parts with novel and more efficient algorithms. This combination outperforms the original approach by a factor of 1.9-2.3×.
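For context, the locally-ordered clustering core that PLOC++ accelerates works roughly as follows: clusters are kept sorted along a space-filling (Morton) curve so spatial neighbors sit near each other in the array, each cluster searches a small window for the neighbor minimizing the merged bounding-box surface area, and mutually-nearest pairs are merged each iteration. Below is a simplified CPU sketch of one nearest-neighbor pass under those assumptions, not the paper's GPU kernels.

```cpp
#include <algorithm>
#include <cfloat>
#include <vector>

struct AABB { float lo[3], hi[3]; };

AABB merge(const AABB& a, const AABB& b) {
    AABB m;
    for (int i = 0; i < 3; ++i) {
        m.lo[i] = std::min(a.lo[i], b.lo[i]);
        m.hi[i] = std::max(a.hi[i], b.hi[i]);
    }
    return m;
}

float surfaceArea(const AABB& b) {
    float d[3] = { b.hi[0]-b.lo[0], b.hi[1]-b.lo[1], b.hi[2]-b.lo[2] };
    return 2.0f * (d[0]*d[1] + d[1]*d[2] + d[2]*d[0]);
}

// For each cluster (assumed Morton-sorted), pick the neighbor within `radius`
// that minimizes the merged surface area. Pairs with nearest[nearest[i]] == i
// are mutually nearest and get merged this iteration.
std::vector<int> findNearest(const std::vector<AABB>& c, int radius) {
    int n = static_cast<int>(c.size());
    std::vector<int> nearest(n, -1);
    for (int i = 0; i < n; ++i) {
        float best = FLT_MAX;
        for (int j = std::max(0, i - radius);
             j <= std::min(n - 1, i + radius); ++j) {
            if (j == i) continue;
            float cost = surfaceArea(merge(c[i], c[j]));
            if (cost < best) { best = cost; nearest[i] = j; }
        }
    }
    return nearest;
}
```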
Citations: 2
Joint Audio-Text Model for Expressive Speech-Driven 3D Facial Animation
Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2021-12-04 · DOI: 10.1145/3522615
Yingruo Fan, Zhaojiang Lin, Jun Saito, Wenping Wang, T. Komura
Speech-driven 3D facial animation with accurate lip synchronization has been widely studied. However, synthesizing realistic motions for the entire face during speech has rarely been explored. In this work, we present a joint audio-text model to capture the contextual information for expressive speech-driven 3D facial animation. Existing datasets are collected to cover as many different phonemes as possible rather than sentences, thus limiting the capability of the audio-based model to learn more diverse contexts. To address this, we propose to leverage the contextual text embeddings extracted from a powerful pre-trained language model that has learned rich contextual representations from large-scale text data. Our hypothesis is that the text features can disambiguate the variations in upper-face expressions, which are not strongly correlated with the audio. In contrast to prior approaches that learn phoneme-level features from the text, we investigate high-level contextual text features for speech-driven 3D facial animation. We show that the combined acoustic and textual modalities can synthesize realistic facial expressions while maintaining audio-lip synchronization. We conduct quantitative and qualitative evaluations as well as a perceptual user study. The results demonstrate the superior performance of our model against existing state-of-the-art approaches.
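At its simplest, the joint-modality idea can be pictured as aligning and concatenating the two feature streams before decoding facial motion. The sketch below is our minimal illustration under that assumption; the paper's actual fusion and decoder architecture are not shown here.

```cpp
#include <vector>

// Fuse a per-frame audio feature with the contextual text embedding aligned to
// that frame; the joint vector would then feed a facial-animation decoder.
// Dimensions, alignment, and the decoder itself are assumptions for illustration.
std::vector<float> fuseModalities(const std::vector<float>& audioFeature,
                                  const std::vector<float>& textEmbedding) {
    std::vector<float> joint;
    joint.reserve(audioFeature.size() + textEmbedding.size());
    joint.insert(joint.end(), audioFeature.begin(), audioFeature.end());
    joint.insert(joint.end(), textEmbedding.begin(), textEmbedding.end());
    return joint;
}
```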
Citations: 13
Supporting Unified Shader Specialization by Co-opting C++ Features
Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2021-09-29 · DOI: 10.1145/3543866
Kerry A. Seitz, Theresa Foley, Serban D. Porumbescu, John Douglas Owens
Modern unified programming models (such as CUDA and SYCL) that combine host (CPU) code and GPU code into the same programming language, same file, and same lexical scope lack adequate support for GPU code specialization, which is a key optimization in real-time graphics. Furthermore, current methods used to implement specialization do not translate to a unified environment. In this paper, we create a unified shader programming environment in C++ that provides first-class support for specialization by co-opting C++'s attribute and virtual function features and reimplementing them with alternate semantics to express the services required. By co-opting existing features, we enable programmers to use familiar C++ programming techniques to write host and GPU code together, while still achieving efficient generated C++ and HLSL code via our source-to-source translator.
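A hypothetical sketch of the co-opted-C++ style the abstract describes: a virtual function marks a shader variation point, and a C++ attribute (here the illustrative [[specialize]], not an identifier from the paper) marks where a source-to-source translator should replace dynamic dispatch with statically specialized code. Conforming compilers ignore unknown attributes, so this remains valid plain C++ on the host.

```cpp
// A shader variation point expressed as an ordinary C++ virtual interface.
struct Light {
    virtual float intensityAt(float distance) const = 0;
    virtual ~Light() = default;
};

struct PointLight : Light {
    float power = 1.0f;
    float intensityAt(float d) const override { return power / (d * d); }
};

struct DirectionalLight : Light {
    float power = 1.0f;
    float intensityAt(float) const override { return power; }
};

// A translator could clone this function once per concrete Light type, baking
// the intensityAt call into each specialized (e.g., HLSL) variant while the
// same source still runs as regular C++ on the CPU.
[[specialize]] float shade(const Light& light, float distance) {
    return light.intensityAt(distance);
}
```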
Citations: 2
Efficient acoustic perception for virtual AI agents
Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2021-09-27 · DOI: 10.1145/3480139
Michael Chemistruck, Andrew Allen, John M. Snyder, N. Raghuvanshi
We model acoustic perception in AI agents efficiently within complex scenes with many sound events. The key idea is to employ perceptual parameters that capture how each sound event propagates through the scene to the agent's location. This naturally conforms the agents' virtual perception to human hearing. We propose a simplified auditory masking model that limits localization capability in the presence of distracting sounds. We show that anisotropic reflections, as well as the initial sound, serve as useful localization cues. Our system is simple, fast, and modular, and obtains natural results in our tests, letting agents navigate through passageways and portals by sound alone and anticipate or track occluded but audible targets. Source code is provided.
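In the spirit of the simplified masking model described above, the sketch below tests whether a target event stands out against the loudest competing event at the agent. This is our hedged reduction of the idea; the struct, the decision rule, and the 6 dB margin are assumptions, not values from the paper.

```cpp
#include <algorithm>
#include <vector>

// A sound event as perceived at the agent, after propagation through the scene.
struct SoundEvent {
    float loudnessDb;  // perceived loudness at the agent's location
};

// An event is treated as localizable only if it is not masked by the loudest
// distractor; marginDb is an assumed tunable threshold.
bool canLocalize(const SoundEvent& target,
                 const std::vector<SoundEvent>& distractors,
                 float marginDb = 6.0f) {
    float loudest = -1e9f;
    for (const auto& e : distractors)
        loudest = std::max(loudest, e.loudnessDb);
    return target.loudnessDb >= loudest - marginDb;
}
```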
Citations: 0