ACM Transactions on Applied Perceptions (TAP): Latest Articles

Creating Word Paintings Jointly Considering Semantics, Attention, and Aesthetics

ACM Transactions on Applied Perceptions (TAP)

Pub Date : 2022-07-04 DOI: 10.1145/3539610

Junsong Zhang, Zuyi Yang, Li Jin, Zhitang Lu, Jinhui Yu

In this article, we present a content-aware method for generating a word painting. A word painting is a composite artwork assembled from words extracted from a given text, and it carries semantics and visual features similar to those of a given source image. However, word paintings are usually created by skilled artists through tedious manual processes, especially when generating streamlines and laying out text. Hence, we provide an easy method for users to create word paintings. The key challenge in generating a visually pleasing word painting is designing a text layout that simultaneously conveys the input image and gives easy access to the semantic theme. To address this issue, given an image and its content-related text, we first decompose the input image into several regions and approximate each region with a smooth vector field. At the same time, by analyzing the input text, we extract weighted keywords to serve as the graphic elements. Then, to measure how likely each position in the input image is to attract observers’ attention, we generate a saliency map with our trained visual attention model. Finally, jointly considering visual attention and aesthetic rules, we propose an energy-based optimization framework to arrange the extracted keywords into the decomposed regions and synthesize a word painting. Experimental results and user studies show that this method can generate fashionable and appealing word paintings.
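The final arrangement step can be illustrated with a deliberately simplified sketch. It is not the authors' optimization framework: it assumes one keyword per region, a single saliency score per region, and a toy energy equal to the negative sum of keyword weight times region saliency. The function names and the energy definition are hypothetical and chosen only for illustration.

```python
from typing import List


def energy(weights: List[float], saliency: List[float], assignment: List[int]) -> float:
    """Toy energy (an assumption): lower when heavy keywords sit in salient regions."""
    return -sum(w * saliency[r] for w, r in zip(weights, assignment))


def assign_keywords(weights: List[float], saliency: List[float]) -> List[int]:
    """Return assignment[i] = region index for keyword i, one keyword per region.

    Pairing the heaviest keyword with the most salient region, the second
    heaviest with the second most salient, and so on minimizes the toy energy
    above (rearrangement inequality).
    """
    if len(weights) != len(saliency):
        raise ValueError("this sketch assumes one keyword per region")
    kw_order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    region_order = sorted(range(len(saliency)), key=lambda r: saliency[r], reverse=True)
    assignment = [0] * len(weights)
    for kw, region in zip(kw_order, region_order):
        assignment[kw] = region
    return assignment


# Hypothetical inputs: three keyword weights and three region saliency scores.
example_weights = [0.2, 0.9, 0.5]
example_saliency = [0.7, 0.1, 0.4]
example_assignment = assign_keywords(example_weights, example_saliency)
```

A real layout would also need terms for the vector field direction, word size, and aesthetic rules mentioned in the abstract; those are omitted here because the abstract does not specify them.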



Citations: 1

Exploring Sonification Mapping Strategies for Spatial Auditory Guidance in Immersive Virtual Environments

ACM Transactions on Applied Perceptions (TAP)

Pub Date : 2022-05-31 DOI: 10.1145/3528171

Zihan Gao, Hui-qiang Wang, Guangsheng Feng, Hongwu Lv

Spatial auditory cues are important for many tasks in immersive virtual environments, especially guidance tasks. However, due to the limited fidelity of spatial sounds rendered by generic Head-Related Transfer Functions (HRTFs), sound localization usually has a limited accuracy, especially in elevation, which can potentially impact the effectiveness of auditory guidance. To address this issue, we explored whether integrating sonification with spatial audio can enhance the perceptions of auditory guidance cues so user performance in auditory guidance tasks can be improved. Specifically, we investigated the effects of sonification mapping strategy using a controlled experiment that compared four elevation sonification mapping strategies: absolute elevation mapping, unsigned relative elevation mapping, signed relative elevation mapping, and binary relative elevation mapping. In addition, we examined whether azimuth sonification mapping can further benefit the perception of spatial sounds. The results demonstrate that spatial auditory cues can be effectively enhanced by integrating elevation and azimuth sonification, where the accuracy and speed of guidance tasks can be significantly improved. In particular, the overall results suggest that binary relative elevation mapping is generally the most effective strategy among four elevation sonification mapping strategies, which indicates that auditory cues with clear directional information are key to efficient auditory guidance.
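The abstract names the four elevation mapping strategies without defining them. The sketch below shows one plausible reading, mapping elevation to the pitch of a cue tone. The pitch range, the choice of pitch as the sonified parameter, the function names, and the exact formulas are all assumptions made for illustration; none of them is taken from the article.

```python
# Hypothetical pitch range for the cue tone, in Hz (not from the article).
F_MIN, F_MAX = 200.0, 2000.0
# Elevation is assumed to lie in [-90, 90] degrees.
MAX_ELEV = 90.0


def _lerp(t: float) -> float:
    """Clamp t to [0, 1] and map it linearly onto the pitch range."""
    t = min(1.0, max(0.0, t))
    return F_MIN + t * (F_MAX - F_MIN)


def absolute(target: float, head: float) -> float:
    # Pitch depends only on the target's elevation; head direction is ignored.
    return _lerp((target + MAX_ELEV) / (2 * MAX_ELEV))


def unsigned_relative(target: float, head: float) -> float:
    # Pitch depends on the size of the elevation error, not its direction.
    return _lerp(abs(target - head) / (2 * MAX_ELEV))


def signed_relative(target: float, head: float) -> float:
    # Pitch is above the midpoint when the target is above the head direction
    # and below the midpoint when the target is below it.
    return _lerp(((target - head) + 2 * MAX_ELEV) / (4 * MAX_ELEV))


def binary_relative(target: float, head: float) -> float:
    # Only direction is conveyed: one tone for "above", another otherwise.
    return F_MAX if target > head else F_MIN
```

Under this reading, the binary mapping discards magnitude entirely and keeps only the direction of the error, which is consistent with the abstract's remark that clear directional information is key.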



Citations: 0

Vibrotactile Threshold Measurements at the Wrist Using Parallel Vibration Actuators

ACM Transactions on Applied Perceptions (TAP)

Pub Date : 2022-05-27 DOI: 10.1145/3529259

Elvar Atli Ævarsson, Thórhildur Ásgeirsdóttir, Finnur Pind, Á. Kristjánsson, Runar Unnthorsson

This article presents an investigation into the perceptual vibrotactile thresholds for a range of frequencies on both the inside and outside areas of the wrist when exciting the skin with parallel vibrations, realized using the L5 actuator made by Lofelt GmbH. The vibrotactile threshold of 30 participants was measured using a modified audiometry test for the frequency range of 25–1,000 Hz. The average threshold across the respective frequencies was then ultimately determined from acceleration minima. The results show that maximum sensitivity lies in the range of 100–275 Hz (peaking at 200 Hz) for the inside and 75–250 Hz (peaking at 125 Hz) for the outside of the wrist and that thresholds are overall higher for the hairy skin on the outside of the wrist than for the glabrous skin on the inside. The results also show that the vibrotactile thresholds varied highly between individuals. Hence, personalized threshold measurements at the actuator locations will be required to fine-tune a device for the user. This study is a part of an ongoing research and development project where the aim is to develop a tactile display device and a music encoding scheme with the purpose of augmenting the musical enjoyment of cochlear implant recipients. These results, along with results from planned follow-up experiments, will be used to determine the appropriate frequency range and to cast light on the dynamic range on offer for the tactile device.



Citations: 3

Machine Learning–based Modeling and Prediction of the Intrinsic Relationship between Human Emotion and Music

ACM Transactions on Applied Perceptions (TAP)

Pub Date : 2022-05-23 DOI: 10.1145/3534966

Jun Su, Pengcheng Zhou

Human emotion is one of the most complex psychophysiological phenomena and has been reported to be affected significantly by music listening. It is supposed that there is an intrinsic relationship between human emotion and music, which can be modeled and predicted quantitatively in a supervised manner. Here, a heuristic clustering analysis is carried out on a large-scale free music archive to derive a genre-diverse music library; the emotional response of participants to this library is measured using a standard protocol, resulting in a systematic emotion-to-music profile. Eight machine learning methods are employed to statistically correlate the basic sound features of the audio tracks in the library with the measured emotional response of participants to the tracks in a training set and to blindly predict the emotional response from sound features in a test set. This study found that nonlinear methods are more robust and predictive but considerably more time-consuming than linear approaches. Neural networks fit the training data well but suffer from significant overfitting. Among all methods used, the support vector machine and Gaussian process exhibit both high internal stability and satisfactory external predictability; they are considered promising tools to model, predict, and explain the intrinsic relationship between human emotion and music. The psychological basis and perceptual implications underlying the built machine learning models are also discussed to identify the key music factors that affect human emotion.



Citations: 0

Evaluating Realism in Example-based Terrain Synthesis

ACM Transactions on Applied Perceptions (TAP)

Pub Date : 2022-05-02 DOI: 10.1145/3531526

Joshua J. Scott, N. Dodgson

We report two studies that investigate the use of subjective believability in the assessment of objective realism of terrain. The first demonstrates that there is a clear subjective feature bias that depends on the types of terrain being evaluated: Our participants found certain natural terrains to be more believable than others. This confounding factor means that any comparison experiment must not ask participants to compare terrains with different types of features. Our second experiment assesses four methods of example-based terrain synthesis, comparing them against each other and against real terrain. Our results show that, while all tested methods can produce terrain that is indistinguishable from reality, all also can produce poor terrain; that there is no one method that is consistently better than the others; and that those who have professional expertise in geology, cartography, or image analysis are better able to distinguish real terrain from synthesized terrain than the general population, but those who have professional expertise in the visual arts are not.


Citations: 0

The Duration of an Auditory Icon Can Affect How the Listener Interprets Its Meaning

ACM Transactions on Applied Perceptions (TAP)

Pub Date : 2022-03-28 DOI: 10.1145/3527269

João P. Cabral, G. Remijn

Initially introduced in the field of informatics, an auditory icon consists of a short sound that is present in everyday life, used to represent a specific event, object, function, or action. Auditory icons have been studied in various fields, and overall, compared to other types of auditory alarms, they can be very efficient in informing the listener about a situation or event. So far, auditory icons have been used with a wide range of durations, from a few hundred milliseconds up to several seconds. However, little is known about whether and how icon duration influences its interpretation. In the present study, we therefore asked listeners to rate 12 auditory icons, divided into four different sound categories (nonverbal human sounds, machine sounds, human activities, and animal vocalizations), at five different durations (200, 400, 800, 1,600, and 3,200 ms). They rated (1) how appropriately the icon sound itself represented the icon's referent and (2) how appropriately each duration of the icon sound represented the icon's referent. Overall, results demonstrate that the duration of the auditory icons in this stimulus set can directly affect how the icon represents the referent. Auditory icons in the test set characterized by human activities represented their referent most appropriately at a relatively short duration (400 or 800 ms). The majority of the auditory icons in the set consisting of machine sounds, nonverbal human sounds, and animal vocalizations, however, were considered to represent their referent more appropriately at longer durations (800 ms and 1,600 ms). Further systematic research is necessary to determine whether the duration effects shown here generalize to other stimulus sets.



Citations: 0

On the Immersive Properties of High Dynamic Range Video

ACM Transactions on Applied Perceptions (TAP)

Pub Date : 2022-03-28 DOI: 10.1145/3524692

S. Hinde, K. Noland, Graham Thomas, David R. Bull, I. Gilchrist

This paper presents the results from two studies which used a dual-task methodology to measure an audience's experience of immersion while watching video under typical television viewing conditions. Immersion was measured while participants watched either a high dynamic range, wide color gamut video or a standard dynamic range, standard color gamut video, in high definition or ultra-high definition. Other video parameters were carefully measured and controlled. The study found that high dynamic range, wide color gamut video is significantly more immersive than standard dynamic range, standard color gamut video in the chosen configuration. However, there was no evidence of significant differences in immersion between high-definition and ultra-high-definition resolutions.


Citations: 2

PTRM: Perceived Terrain Realism Metric

ACM Transactions on Applied Perceptions (TAP)

Pub Date : 2022-02-11 DOI: 10.1145/3514244

S. D. Rajasekaran, Hao Kang, Martin Čadík, Eric Galin, É. Guérin, A. Peytavie, P. Slavík, Bedrich Benes

Terrains are visually prominent and commonly needed objects in many computer graphics applications. While there are many algorithms for synthetic terrain generation, it is rather difficult to assess the realism of a generated output. This article presents a first step toward perceptual evaluation of terrain models. We gathered and categorized several classes of real terrains, and we generated synthetic terrain models using computer graphics methods. The terrain geometries were rendered using the same texturing, lighting, and camera position. Two studies on these image sets were conducted, ranking the terrains perceptually and showing that the synthetic terrains are perceived as lacking realism compared to the real ones. We provide insight into the features that affect perceived realism through a quantitative evaluation based on localized geomorphology-based landform features (geomorphons) that categorize terrain structures such as valleys, ridges, hollows, and so forth. We show that the presence or absence of certain features has a significant perceptual effect. The importance and presence of the terrain features were confirmed by using a generative deep neural network that transferred the features between the geometric models of the real terrains and the synthetic ones. The feature transfer was followed by another perceptual experiment that further showed their importance and effect on perceived realism. We then introduce the Perceived Terrain Realism Metric (PTRM), which estimates the human-perceived realism of a terrain represented as a digital elevation map by relating the distribution of terrain features to their perceived realism. This metric can be used on a synthetic terrain, and it will output an estimated level of perceived realism. We validated the proposed metric on real and synthetic data and compared it to the perceptual studies.



Citations: 3

Display-Size Dependent Effects of 3D Viewing on Subjective Impressions

ACM Transactions on Applied Perceptions (TAP)

Pub Date : 2022-01-26 DOI: 10.1145/3510461

Yamato Miyashita, Y. Sawahata, Akihiro Sakai, M. Harasawa, Kazuhiro Hara, T. Morita, K. Komine

This paper describes how the screen size of 3D displays affects the subjective impressions of 3D-visualized content. The key requirement for 3D displays is the presentation of depth cues comprising binocular disparities and/or motion parallax; however, developing displays and producing content that include these cues increases costs. Given the variety of screen sizes, it is expected that viewers experience 3D characteristics differently depending on the screen size. We asked 48 participants to evaluate the 3D experience when using three different-sized stereoscopic displays (11.5, 55, and 200 inches) with head trackers. After viewing each of six stimuli, the participants scored it on 20 opposite-term pairs based on the semantic differential method. Using factor analysis, we extracted three principal factors: power, related to strong three-dimensionality, real, etc.; visibility, related to stable, natural, etc.; and space, related to agile, open, etc., which had proportions of variance of 0.317, 0.277, and 0.251, respectively; their cumulative proportion was 0.844. We confirmed that the three different-sized displays did not produce the same subjective impressions of the 3D characteristics. In particular, on the small-sized display, we found larger effects on power and space impressions from motion parallax (η² = 0.133 and 0.161, respectively) than for the other two sizes. We found degradation of the visibility impressions from binocular disparities, which might be caused by artifacts from stereoscopy. The effects of 3D viewing on subjective impressions depend on the display size, and small-sized displays offer the largest benefits from adding 3D characteristics to 2D visualization.
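As a quick arithmetic check on the factor-analysis figures quoted above, the sketch below sums the three reported proportions of variance. Only the three proportions and the 0.844 total come from the abstract; the variable names are illustrative, and attributing the small gap to rounding is an assumption.

```python
# Proportions of variance reported in the abstract for the three factors.
proportions = {"power": 0.317, "visibility": 0.277, "space": 0.251}

# Cumulative proportion as reported in the abstract.
reported_cumulative = 0.844

# Sum of the per-factor values as printed (each already rounded to 3 decimals).
summed = sum(proportions.values())

# Each printed value can be off by up to 0.0005, and so can the printed total,
# so a gap of up to about 0.002 is compatible with rounding alone (assumption).
gap = abs(summed - reported_cumulative)
rounding_explains_gap = gap <= 0.002
```

The printed values sum to roughly 0.845, so the reported 0.844 is consistent with the total having been computed from unrounded proportions.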

Citations: 0