
Proceedings of the 8th ACM on Multimedia Systems Conference: Latest Publications

Fuji-chan: A unique IoT ambient display for monitoring Mount Fuji's conditions
Pub Date : 2017-06-20 DOI: 10.1145/3083187.3083223
Paul Haimes, Tetsuaki Baba, Hiroya Suda, Kumiko Kushiyama
Fuji-chan is a simple ambient display device that uses a wireless internet connection to monitor two important characteristics of Mount Fuji, Japan's highest mountain. The system utilises two internet-based data feeds to inform people of the weather conditions at the peak of the mountain, along with the current level of volcanic eruption risk. We consider the latter information in particular to be of great importance. These two data feeds are communicated via LEDs placed at the top and base of the device, along with aural output to indicate volcanic eruption warning levels. We also created a simple web interface for this information. By creating this device and application, we aim to reimagine how geospatial information can be presented, while also creating something which is visually appealing. Through the demonstration of this multimodal system, we also aim to promote the idea of an "Internet of Beautiful Things", where IoT technology is applied to interactive artworks.
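The polling-and-display loop such an ambient device implies can be pictured with a minimal Python sketch. The feed URLs, JSON fields, and LED driver below are hypothetical stand-ins; the abstract does not name the actual data sources or hardware interface.

```python
import json
import time
import urllib.request

# Hypothetical endpoints -- the paper does not specify its two data feeds.
WEATHER_FEED = "https://example.org/fuji/summit-weather.json"
ERUPTION_FEED = "https://example.org/fuji/eruption-level.json"

# Map JMA-style volcanic alert levels (1-5) to colours on the base LEDs.
LEVEL_COLOURS = {1: "green", 2: "yellow", 3: "orange", 4: "red", 5: "red"}

def set_led(position: str, colour: str) -> None:
    # Stand-in for the device's LED driver (e.g. GPIO or NeoPixel calls).
    print(f"LED[{position}] -> {colour}")

def poll_once() -> None:
    weather = json.load(urllib.request.urlopen(WEATHER_FEED))
    eruption = json.load(urllib.request.urlopen(ERUPTION_FEED))
    # Top LEDs show summit weather; base LEDs show eruption risk.
    set_led("top", "blue" if weather.get("condition") == "clear" else "white")
    set_led("base", LEVEL_COLOURS.get(eruption.get("level"), "red"))
    if eruption.get("level", 1) >= 3:
        print("\a", end="")  # aural warning for elevated eruption risk

if __name__ == "__main__":
    while True:
        poll_once()
        time.sleep(600)  # re-poll the feeds every ten minutes
```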
Citations: 2
Towards Fully Offloaded Cloud-based AR: Design, Implementation and Experience
Pub Date : 2017-06-20 DOI: 10.1145/3083187.3084012
R. Shea, Andy Sun, Silvery Fu, Jiangchuan Liu
Combining advanced sensors with powerful processing capabilities, smartphone-based augmented reality (AR) is becoming increasingly prevalent. The growing prominence of these resource-hungry AR applications poses significant challenges to energy-constrained environments such as mobile phones. To that end, we present a platform for offloading AR applications to powerful cloud servers. We implement this system using a thin-client design and explore its performance using the real-world application Pokemon Go as a case study. We show that, with careful design, a thin client is capable of offloading much of the AR processing to a cloud server, with the results streamed back. Our initial experiments show substantial energy savings, low latency and excellent image quality, even at relatively low bit-rates.
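The thin-client loop the abstract describes can be sketched in a few lines of Python: capture on the phone, render in the cloud, stream the result back. The host name and length-prefixed wire format here are our assumptions, not the authors' protocol.

```python
import socket
import struct

def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from the socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("server closed the connection")
        buf += chunk
    return buf

def offload_frame(sock: socket.socket, camera_jpeg: bytes) -> bytes:
    """Ship one camera frame to the cloud renderer; receive the rendered
    result to display. A 4-byte length prefix frames each message."""
    sock.sendall(struct.pack(">I", len(camera_jpeg)) + camera_jpeg)
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length)

# The phone keeps only capture and display; all AR logic runs server-side.
if __name__ == "__main__":
    sock = socket.create_connection(("ar-cloud.example.org", 9000))  # hypothetical host
    with open("frame.jpg", "rb") as f:
        rendered = offload_frame(sock, f.read())
```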
Citations: 22
DroneFace: An Open Dataset for Drone Research
Pub Date : 2017-06-20 DOI: 10.1145/3083187.3083214
Hwai-Jung Hsu, Kuan-Ta Chen
In this paper, we present DroneFace, an open dataset for testing how well face recognition can work on drones. Because of their high mobility, drones, i.e. unmanned aerial vehicles (UAVs), are appropriate for surveillance, daily patrol or seeking lost people on the streets, and thus need the capability of tracking human targets' faces from the air. In this context, a drone's distance and height from the target influence the accuracy of face recognition. In order to test whether a face recognition technique is suitable for drones, we establish DroneFace, composed of facial images taken from various combinations of distances and heights, for evaluating how well a face recognition technique recognizes designated faces from the air. Face recognition is one of the most successful applications in image analysis and understanding, and many face recognition databases exist for various purposes. To the best of our knowledge, DroneFace is the only dataset comprising facial images taken from controlled distances and heights within an unconstrained environment, and it can be valuable for future study of integrating face recognition techniques onto drones.
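A natural use of such a dataset is to score a recogniser over the distance/height grid. The folder naming, the value ranges, and the recogniser stub in this sketch are invented for illustration; they are not taken from the dataset's documentation.

```python
import glob
import itertools

DISTANCES_M = [2, 3, 4, 5]    # horizontal distance to the subject (assumed values)
HEIGHTS_M = [1.5, 3.0, 4.5]   # camera height above ground (assumed values)

def recognise(image_path: str) -> bool:
    """Stand-in for whatever face recogniser is under test; returns hit/miss."""
    raise NotImplementedError

def accuracy_grid(root: str) -> dict:
    """Recognition accuracy per (distance, height) capture condition."""
    grid = {}
    for d, h in itertools.product(DISTANCES_M, HEIGHTS_M):
        images = glob.glob(f"{root}/dist{d}m_height{h}m/*.jpg")  # assumed layout
        hits = sum(recognise(p) for p in images)
        grid[(d, h)] = hits / len(images) if images else None
    return grid
```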
Citations: 31
Modeling User Quality of Experience (QoE) through Position Discrepancy in Multi-Sensorial, Immersive, Collaborative Environments
Pub Date : 2017-06-20 DOI: 10.1145/3083187.3084018
Shanthi Vellingiri, Prabhakaran Balakrishnan
Users' QoE (Quality of Experience) in Multi-sensorial, Immersive, Collaborative Environments (MICE) applications is mostly measured by psychometric studies. These studies provide a subjective insight into the performance of such applications. In this paper, we hypothesize that the spatial coherence, or lack thereof, of the embedded virtual objects among users correlates with QoE in MICE. We use Position Discrepancy (PD) to model this lack of spatial coherence in MICE. Based on that, we propose a Hierarchical Position Discrepancy Model (HPDM) that computes PD at multiple levels to derive the application/system-level PD as a measure of performance. Experimental results on an example task in MICE show that HPDM can objectively quantify application performance and correlates with the psychometric study-based QoE measurements. We envisage that HPDM can provide more insight into MICE applications without the need for extensive user study.
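The abstract gives only the idea of PD aggregated across levels. Under the assumption that object-level PD is the Euclidean distance between the poses two sites observe for the same virtual object, the hierarchy might aggregate as in this sketch; the paper's actual levels and weighting may differ.

```python
import math

def object_pd(pos_a, pos_b) -> float:
    """Level 1: discrepancy of one shared virtual object between two sites."""
    return math.dist(pos_a, pos_b)

def scene_pd(objects_a: dict, objects_b: dict) -> float:
    """Level 2: mean discrepancy over all shared objects in one frame."""
    return sum(object_pd(objects_a[k], objects_b[k]) for k in objects_a) / len(objects_a)

def session_pd(frames) -> float:
    """Level 3: mean scene PD over the whole collaborative session."""
    return sum(scene_pd(a, b) for a, b in frames) / len(frames)

# Example: one object drifts 5 cm between the two users' views of one frame.
frames = [({"cube": (0.0, 0.0, 0.0)}, {"cube": (0.05, 0.0, 0.0)})]
print(session_pd(frames))  # 0.05
```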
Citations: 6
Navigable Videos for Presenting Scientific Data on Affordable Head-Mounted Displays
Pub Date : 2017-06-20 DOI: 10.1145/3083187.3084015
J. Chu, Chris Bryan, Min Shih, Leonardo Ferrer, K. Ma
Immersive, stereoscopic visualization enables scientists to better analyze structural and physical phenomena compared to traditional display media. Unfortunately, current head-mounted displays (HMDs) with the high rendering quality necessary for these complex datasets are prohibitively expensive, especially in educational settings where their high cost makes it impractical to buy several devices. To address this problem, we develop two tools: (1) An authoring tool allows domain scientists to generate a set of connected, 360° video paths for traversing between dimensional keyframes in the dataset. (2) A corresponding navigational interface is a video selection and playback tool that can be paired with a low-cost HMD to enable an interactive, non-linear storytelling experience. We demonstrate the authoring tool's utility by conducting several case studies and assess the navigational interface with a usability study. Results show the potential of our approach in effectively expanding the accessibility of high-quality, immersive visualization to a wider audience using affordable HMDs.
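The authored output can be thought of as a directed graph whose nodes are dimensional keyframes and whose edges carry pre-rendered 360° clips. The node names and clip files in this sketch are illustrative, not from the paper.

```python
# Edges map (source keyframe, destination keyframe) to a pre-rendered clip.
paths = {
    ("overview", "vortex"): "overview_to_vortex.mp4",
    ("vortex", "overview"): "vortex_to_overview.mp4",
    ("overview", "core"): "overview_to_core.mp4",
}

def reachable(node: str) -> list:
    """Destinations the viewer may select from the current keyframe."""
    return [dst for (src, dst) in paths if src == node]

def play(src: str, dst: str) -> str:
    """Play the authored clip for one hop of non-linear navigation."""
    clip = paths.get((src, dst))
    if clip is None:
        raise ValueError(f"no authored path from {src} to {dst}")
    print(f"playing {clip} on the HMD")  # hand off to the video player
    return dst

node = "overview"
node = play(node, reachable(node)[0])  # viewer picks a destination, one hop
```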
Citations: 9
Hyperion: A Wearable Augmented Reality System for Text Extraction and Manipulation in the Air
Pub Date : 2017-06-20 DOI: 10.1145/3083187.3084017
Dimitris Chatzopoulos, Carlos Bermejo, Zhanpeng Huang, Arailym Butabayeva, Rui Zheng, Morteza Golkarifard, P. Hui
We develop Hyperion, a Wearable Augmented Reality (WAR) system based on Google Glass for accessing text information in the ambient environment. Hyperion is able to retrieve text content from the user's current view and deliver that content in different ways according to the user's context. We design four work modalities for the different situations that mobile users encounter in their daily activities. In addition, user interaction interfaces are provided to adapt to different application scenarios. Although Google Glass is constrained by its poor computational capabilities and limited battery capacity, we utilize code-level offloading to companion mobile devices to improve the runtime performance and sustainability of WAR applications. System experiments show that Hyperion improves users' ability to be aware of text information around them. Our prototype indicates the promising potential of converging WAR technology and wearable devices such as Google Glass to improve people's daily activities.
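The abstract names four work modalities without listing them, so the dispatch below is a hypothetical sketch of how extracted text might be delivered differently per context; the context names are invented, and print stubs stand in for the Glass display, text-to-speech, and the OCR offloaded to the phone.

```python
def extract_text(frame) -> str:
    """Stand-in for OCR offloaded to the companion phone."""
    raise NotImplementedError

def speak(text: str):           print(f"[TTS] {text}")    # read aloud
def show_overlay(text: str):    print(f"[HUD] {text}")    # Glass display
def store_for_later(text: str): print(f"[saved] {text}")  # review later

def deliver(text: str, context: str) -> None:
    """Choose a delivery modality from the user's current context."""
    if context == "walking":    # eyes busy: read the text aloud
        speak(text)
    elif context == "reading":  # anchor extracted text in the display
        show_overlay(text)
    else:                       # fall back to storing for later review
        store_for_later(text)

deliver("Gate B27 ->", context="walking")
```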
Citations: 18
Recognition of Easily-confused TCM Herbs Using Deep Learning
Pub Date : 2017-06-20 DOI: 10.1145/3083187.3083226
Juei-Chun Weng, Min-Chun Hu, Kun-Chan Lan
Chinese herbal medicine (CHM) plays an important role in treatment within traditional Chinese medicine (TCM). Traditionally, CHM is used to restore the body's balance for the sick and to maintain health for the healthy. However, a lack of knowledge about the herbs may lead to their misuse. In this demo, we present a real-time smartphone application which can not only recognize easily-confused herbs using a Convolutional Neural Network (CNN), but also provide relevant information about the detected herbs. Our Chinese herb recognition system is implemented on a cloud server and can be used by the client via a smartphone. The recognition system is evaluated by 5-fold cross-validation and achieves an accuracy of around 96%, which is adequate for real-world use.
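The 5-fold evaluation protocol reported here follows a standard pattern; a generic harness looks like the sketch below, with the CNN's training and scoring left as callables since the abstract does not describe the architecture.

```python
import numpy as np

def kfold_indices(n: int, k: int = 5, seed: int = 0):
    """Shuffle, then split sample indices into k folds."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)

def cross_validate(train_fn, eval_fn, X, y, k: int = 5) -> float:
    """Mean held-out accuracy over k folds (5-fold CV as in the paper)."""
    folds = kfold_indices(len(X), k)
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = train_fn(X[train_idx], y[train_idx])   # fit the CNN here
        scores.append(eval_fn(model, X[test_idx], y[test_idx]))
    return float(np.mean(scores))
```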
Citations: 8
A Visual Latency Estimator for 3D Tele-Immersion
Pub Date : 2017-06-20 DOI: 10.1145/3083187.3084019
S. Raghuraman, K. Bahirat, B. Prabhakaran
3D Tele-Immersion (3DTI) systems allow geographically distributed users to interact in a virtual world using their "live" 3D models. The capture, reconstruction, transfer, and rendering of these models introduce significant latency into the system. Implicit Latency (ℒ') can be estimated using system clocks to measure the time from when data is received from the RGB-D camera until the result is requested for rendering. The Observed Latency (ℒ) between a real-world event and that event being rendered on the display cannot be accurately represented by ℒ', since ℒ' ignores the time taken to capture frames, update the display, etc. In this paper, a Visual Pattern based Latency Estimation (VPLE) approach is introduced to calculate the real-world visual latency of a system without the need for any custom hardware. VPLE generates a constantly changing pattern that is captured and rendered by the 3DTI system. An external observer records both the pattern and the rendered results at high frame rates. ℒ is estimated by calculating the difference between the generated and rendered patterns. VPLE is extended to allow ℒ estimation between geographically distributed sites. Evaluations show that the accuracy of VPLE depends on the refresh rate of the pattern and is within 4 ms. The ℒ of a distributed 3DTI system implemented on the GPU is significantly lower than that of the CPU implementation, and is comparable to video streaming. It is also shown that ℒ' estimates for GPU-based 3DTI implementations are off by almost 100% compared to ℒ.
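The core of the pattern-difference idea reduces to a couple of lines once you assume the pattern encodes a counter that advances at a fixed rate; the encoding is our assumption, since the abstract only says the pattern changes constantly. A 240 Hz pattern step would give roughly the 4 ms resolution the paper reports.

```python
# Pattern step period: 240 Hz gives ~4.2 ms resolution per step (assumed rate).
PERIOD_S = 1 / 240

def latency_from_frame(pattern_generated: int, pattern_rendered: int) -> float:
    """Observed latency for one externally captured camera frame, in seconds.
    The frame shows the source display (current pattern step) and the 3DTI
    output (an older step) side by side; the step difference is the latency."""
    return (pattern_generated - pattern_rendered) * PERIOD_S

# Example: the camera decodes step 1000 on the source display and step 988 on
# the rendered output -> 12 steps at ~4.2 ms each, i.e. 50 ms observed latency.
print(round(latency_from_frame(1000, 988) * 1000, 1))  # 50.0 (milliseconds)
```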
Citations: 2
Towards Bandwidth Efficient Adaptive Streaming of Omnidirectional Video over HTTP: Design, Implementation, and Evaluation
Pub Date : 2017-06-20 DOI: 10.1145/3083187.3084016
M. Graf, C. Timmerer, Christopher Müller
Real-time entertainment services, such as audiovisual content streamed over the open, unmanaged Internet, now account for more than 70% of traffic during peak periods. More and more such bandwidth-hungry applications and services are being proposed, including immersive media services such as virtual reality and, specifically, omnidirectional/360-degree video. The adaptive streaming of omnidirectional video over HTTP imposes an important challenge on today's video delivery infrastructures, which calls for dedicated, thoroughly designed techniques for content generation, delivery, and consumption. This paper describes the usage of tiles, as specified within modern video codecs such as HEVC/H.265 and VP9, to enable bandwidth-efficient adaptive streaming of omnidirectional video over HTTP, and we define various streaming strategies. The parameters and characteristics of a dataset for omnidirectional video are proposed and instantiated by example to evaluate various aspects of such an ecosystem, namely bitrate overhead, bandwidth requirements, and quality in terms of viewport PSNR. The results indicate bitrate savings from 40% (in a realistic scenario with recorded head movements from real users) up to 65% (in an ideal scenario with a centered/fixed viewport), and serve as a baseline and guideline for advanced techniques, including the outline of a research roadmap for the near future.
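One simple tile-based strategy is to fetch the viewport tiles at high quality and everything else at low quality. The 6x4 grid and two-rate policy in this sketch are illustrative; the paper evaluates several such strategies, and this is not claimed to be any one of them.

```python
COLS, ROWS = 6, 4  # tile grid over the equirectangular frame (assumed size)

def tile_of(yaw_deg: float, pitch_deg: float):
    """Tile containing the viewport centre (yaw in [0,360), pitch in [-90,90])."""
    col = int(((yaw_deg % 360) / 360) * COLS) % COLS
    row = min(int(((90 - pitch_deg) / 180) * ROWS), ROWS - 1)
    return col, row

def bitrate_plan(view_yaw: float, view_pitch: float,
                 hi_kbps: int = 4000, lo_kbps: int = 500) -> dict:
    """High bitrate for the viewport tile and its neighbours (with horizontal
    wrap-around), low bitrate everywhere else."""
    vc, vr = tile_of(view_yaw, view_pitch)
    plan = {}
    for c in range(COLS):
        for r in range(ROWS):
            near = min(abs(c - vc), COLS - abs(c - vc)) <= 1 and abs(r - vr) <= 1
            plan[(c, r)] = hi_kbps if near else lo_kbps
    return plan

print(bitrate_plan(10, 0)[(0, 1)])  # a viewport-adjacent tile -> 4000
```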
Citations: 189
A Dataset of Head and Eye Movements for 360 Degree Images
Pub Date : 2017-06-20 DOI: 10.1145/3083187.3083218
Yashas Rai, Jesús Gutiérrez, P. Callet
Understanding how observers watch visual stimuli such as images and videos has helped the multimedia encoding, transmission, quality assessment and rendering communities immensely, allowing them to learn which regions are important to an observer and to provide him/her an optimum quality of experience. The problem is even more pronounced in the case of 360-degree stimuli, considering that much of the content may not be seen by observers at all, while other regions may be extraordinarily important. Attention studies in this area have, however, been missing, mainly due to the lack of a dataset and guidelines to evaluate and compare visual attention/saliency in such scenarios. In this work, we present a dataset of sixty different 360-degree images, each watched by at least 40 observers. Additionally, we provide guidelines and tools to the community regarding the procedure to evaluate and compare saliency in omnidirectional images. Some basic image- and observer-agnostic viewing characteristics, such as the variation of exploration strategies with time and expertise, as well as the effect of eye movement within the viewport, are explored. The dataset and tools are made available for free use by the community and are expected to promote reproducible research for all future work on computational modeling of attention in 360-degree scenarios.
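A typical first step with such data is to rasterise the recorded gaze directions into an equirectangular fixation density map for comparison against saliency models. The map size, the box blur, and the decision to ignore latitude distortion in this sketch are simplifications of ours, not the dataset's actual tooling.

```python
import numpy as np

def fixation_map(samples_deg, width=256, height=128, radius_px=5):
    """Accumulate (yaw, pitch) gaze samples into an equirectangular density
    map, then smooth with a separable box blur (a crude stand-in for the
    Gaussian usually used; latitude distortion is ignored here)."""
    fmap = np.zeros((height, width))
    for yaw, pitch in samples_deg:
        x = int(((yaw % 360) / 360) * width) % width
        y = min(int(((90 - pitch) / 180) * height), height - 1)
        fmap[y, x] += 1
    kernel = np.ones(2 * radius_px + 1) / (2 * radius_px + 1)
    fmap = np.apply_along_axis(lambda r: np.convolve(r, kernel, "same"), 1, fmap)
    fmap = np.apply_along_axis(lambda c: np.convolve(c, kernel, "same"), 0, fmap)
    peak = fmap.max()
    return fmap / peak if peak > 0 else fmap

print(fixation_map([(0.0, 0.0), (5.0, 2.0)]).shape)  # (128, 256)
```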
Citations: 170