
Proceedings of the 9th ACM Multimedia Systems Conference: Latest Publications

Blind image quality assessment based on multiscale salient local binary patterns
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3204960
P. Freitas, Sana Alamgeer, W. Y. L. Akamine, Mylène C. Q. Farias
Due to the rapid development of multimedia technologies, image quality assessment (IQA) has become an important topic over the last decades. As a consequence, a great research effort has been made to develop computational models that estimate image quality. Among the possible IQA approaches, blind IQA (BIQA) is of fundamental interest because it can be used in most multimedia applications. BIQA techniques measure the perceptual quality of an image without using the reference (or pristine) image. This paper proposes a new BIQA method that combines texture features and saliency maps of an image. Texture features are extracted from the images using the local binary pattern (LBP) operator at multiple scales. To extract the salient areas of an image, i.e. the areas that are the main attractors of the viewers' attention, we use computational visual attention models that output saliency maps. These saliency maps can be used as weighting functions for the LBP maps at multiple scales. We propose an operator that combines multiscale LBP maps and saliency maps, called the multiscale salient local binary pattern (MSLBP) operator. To determine the best saliency model to use in the proposed operator, we investigate the performance of several saliency models. Experimental results demonstrate that the proposed method is able to estimate the quality of impaired images with a wide variety of distortions. The proposed metric has a better prediction accuracy than state-of-the-art IQA methods.
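The abstract does not include code, but the weighting idea can be sketched. Below is a minimal Python sketch of saliency-weighted multiscale LBP pooling, with a centre-prior map standing in for the visual-attention models the authors compare; the scales and histogram binning are illustrative choices, not the paper's exact parameters.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def centre_prior_saliency(h, w):
    """Placeholder saliency map: a Gaussian centre prior (a hypothetical
    stand-in for the saliency models evaluated in the paper)."""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    d2 = ((ys - cy) / h) ** 2 + ((xs - cx) / w) ** 2
    return np.exp(-d2 / 0.05)

def mslbp_features(gray, scales=(1, 2, 3), n_points=8, n_bins=10):
    """Histograms of uniform LBP codes at several radii, with each pixel's
    vote weighted by its saliency -- one plausible reading of using saliency
    maps as weighting functions for the multiscale LBP maps."""
    sal = centre_prior_saliency(*gray.shape)
    feats = []
    for r in scales:
        codes = local_binary_pattern(gray, n_points, r, method="uniform")
        hist = np.zeros(n_bins)
        for b in range(n_bins):
            hist[b] = sal[codes == b].sum()        # saliency-weighted count
        feats.append(hist / (hist.sum() + 1e-12))  # normalise per scale
    return np.concatenate(feats)  # feed this vector to a quality regressor

gray = (np.random.rand(64, 64) * 255).astype(np.uint8)  # stand-in luminance image
print(mslbp_features(gray).shape)  # (30,) = 3 scales x 10 uniform-LBP bins
```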
Citations: 14
Cardea
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3204973
Jiayu Shu, Rui Zheng, P. Hui
The growing popularity of mobile and wearable devices with built-in cameras, together with social media sites, is threatening people's visual privacy. Motivated by recent user studies showing that people's visual privacy concerns are closely related to context, we propose Cardea, a context-aware visual privacy protection mechanism that protects people's visual privacy in photos according to their privacy preferences. We define four context elements in a photo: location, scene, the presence of others, and hand gestures. Users can specify their context-dependent privacy preferences based on these four elements. Cardea offers fine-grained visual privacy protection to those who request it using their identifiable information. We present how Cardea can be integrated into: a) privacy-protecting camera apps, where captured photos are processed before being saved locally; and b) online social media and networking sites, where uploaded photos are first examined to protect individuals' visual privacy before they become visible to others. Our evaluation results on an implemented prototype demonstrate that Cardea is effective, with 86% overall accuracy, and is welcomed by users, showing a promising future for context-aware visual privacy protection in photo taking and sharing.
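To make the context-matching idea concrete, here is a toy Python sketch of a decision rule over the four context elements named above; the field names, rule priorities, and gesture semantics are our assumptions for illustration, not Cardea's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

@dataclass
class Preference:
    """A person's context-dependent privacy preferences (hypothetical fields)."""
    sensitive_locations: Set[str] = field(default_factory=set)  # e.g. {"hospital"}
    sensitive_scenes: Set[str] = field(default_factory=set)     # e.g. {"bar"}
    avoid_with: Set[str] = field(default_factory=set)           # IDs of other people

@dataclass
class PhotoContext:
    """The four context elements extracted from one photo."""
    location: str
    scene: str
    people_present: Set[str]
    gesture: Optional[str] = None  # "yes" / "no" hand gesture, if detected

def should_protect(pref: Preference, ctx: PhotoContext) -> bool:
    if ctx.gesture == "no":   # assumed: an explicit gesture overrides everything
        return True
    if ctx.gesture == "yes":
        return False
    return (ctx.location in pref.sensitive_locations
            or ctx.scene in pref.sensitive_scenes
            or bool(ctx.people_present & pref.avoid_with))

pref = Preference(sensitive_locations={"hospital"}, avoid_with={"bob"})
ctx = PhotoContext(location="park", scene="outdoor", people_present={"alice", "bob"})
print(should_protect(pref, ctx))  # True: a person the user avoids is present
```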
{"title":"Cardea","authors":"Jiayu Shu, Rui Zheng, P. Hui","doi":"10.1145/3204949.3204973","DOIUrl":"https://doi.org/10.1145/3204949.3204973","url":null,"abstract":"The growing popularity of mobile and wearable devices with built-in cameras and social media sites are now threatening people's visual privacy. Motivated by recent user studies that people's visual privacy concerns are closely related to context, we propose Cardea, a context-aware visual privacy protection mechanism that protects people's visual privacy in photos according to their privacy preferences. We define four context elements in a photo, including location, scene, others' presences, and hand gestures. Users can specify their context-dependent privacy preferences based on the above four elements. Cardea will offer fine-grained visual privacy protection service to those who request protection using their identifiable information. We present how Cardea can be integrated into: a) privacy-protecting camera apps, where captured photos will be processed before being saved locally; and b) online social media and networking sites, where uploaded photos will first be examined to protect individuals' visual privacy, before they become visible to others. Our evaluation results on an implemented prototype demonstrate that Cardea is effective with 86% overall accuracy and is welcomed by users, showing promising future of context-aware visual privacy protection for photo taking and sharing.","PeriodicalId":141196,"journal":{"name":"Proceedings of the 9th ACM Multimedia Systems Conference","volume":" 39","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120829909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
A Canadian French emotional speech dataset
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3208121
P. Gournay, Olivier Lahaie, R. Lefebvre
Until recently, there was no emotional speech dataset available in Canadian French. This was a limiting factor for research activities not only in Canada, but also elsewhere. This paper introduces the newly released Canadian French Emotional (CaFE) speech dataset and gives details about its design and content. The dataset contains six different sentences, pronounced by six male and six female actors, in six basic emotions plus one neutral emotion. The six basic emotions are acted at two different intensities. The audio is digitally recorded at high resolution (192 kHz sampling rate, 24 bits per sample). This new dataset is freely available under a Creative Commons license (CC BY-NC-SA 4.0).
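The described design implies a fixed inventory of recordings. A small Python sketch enumerating those combinations follows; the labels and the assumption that neutral is recorded at a single intensity are ours, not the dataset's documented layout.

```python
from itertools import product

# 6 male + 6 female actors; labels are hypothetical placeholders
ACTORS = [f"{sex}{i}" for sex, i in product("MF", range(1, 7))]
SENTENCES = range(1, 7)
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]
INTENSITIES = ["weak", "strong"]

clips = [(a, s, e, i) for a in ACTORS for s in SENTENCES
         for e in EMOTIONS for i in INTENSITIES]
clips += [(a, s, "neutral", "-") for a in ACTORS for s in SENTENCES]

# 12 actors x 6 sentences x (6 emotions x 2 intensities + 1 neutral) = 936,
# assuming neutral is recorded once per actor and sentence
print(len(clips))
```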
Citations: 32
Foveated streaming of virtual reality videos
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3208114
Miguel Fabián Romero Rondón, L. Sassatelli, F. Precioso, R. Aparicio-Pardo
While Virtual Reality (VR) represents a revolution in the user experience, current VR systems are flawed in several respects. The difficulty of focusing naturally in current headsets causes visual discomfort and cognitive overload, while high-end headsets require a tether to powerful hardware for scene synthesis. One of the major solutions envisioned to address these problems is foveated rendering. We consider the problem of streaming stored 360° videos to a VR headset equipped with eye-tracking and foveated rendering capabilities. Our end research goal is to build high-performing foveated streaming systems that allow the playback buffer to build up and absorb network variations, which none of the current proposals permits. We present our foveated streaming prototype based on the FOVE, one of the first commercially available headsets with an integrated eye-tracker. We build on the FOVE's Unity API to design a gaze-adaptive streaming system using one low-resolution and one high-resolution segment, from which the foveal region is cropped with per-frame filters. The low- and high-resolution frames are then merged at the client to approximate the natural focusing process.
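As a rough illustration of the client-side merge step described above, the following Python sketch pastes a high-resolution foveal crop over an upscaled low-resolution frame with a soft circular blend around the gaze point; the shapes, blend radius, and compositing rule are assumptions, not the prototype's actual per-frame filters.

```python
import numpy as np

def merge_foveated(low_up, high_crop, gaze_xy, feather=16):
    """low_up: (H, W, 3) low-res frame already upscaled to full size.
    high_crop: (h, w, 3) high-res tile cropped around the gaze.
    gaze_xy: (x, y) gaze position in full-frame coordinates."""
    H, W, _ = low_up.shape
    h, w, _ = high_crop.shape
    # clamp the crop so it stays inside the frame
    x0 = int(np.clip(gaze_xy[0] - w // 2, 0, W - w))
    y0 = int(np.clip(gaze_xy[1] - h // 2, 0, H - h))
    # radial alpha mask: 1 in the fovea, fading to 0 over `feather` pixels
    ys, xs = np.mgrid[0:h, 0:w]
    d = np.sqrt((ys - h / 2) ** 2 + (xs - w / 2) ** 2)
    alpha = np.clip((min(h, w) / 2 - d) / feather, 0, 1)[..., None]
    out = low_up.astype(float)
    region = out[y0:y0 + h, x0:x0 + w]
    out[y0:y0 + h, x0:x0 + w] = alpha * high_crop + (1 - alpha) * region
    return out.astype(low_up.dtype)

frame = merge_foveated(np.zeros((720, 1280, 3), np.uint8),
                       np.full((256, 256, 3), 255, np.uint8),
                       gaze_xy=(640, 360))
print(frame.shape)  # (720, 1280, 3): full frame with a blended foveal patch
```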
Citations: 22
Open video datasets over operational mobile networks with MONROE
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3208138
Cise Midoglu, Mohamed Moulay, V. Mancuso, Özgü Alay, Andra Lutu, C. Griwodz
Video streaming is a very popular service among the end-users of Mobile Broadband (MBB) networks. DASH and WebRTC are two key technologies for the delivery of mobile video. In this work, we empirically assess the performance of video streaming with DASH and WebRTC in operational MBB networks, using a large number of programmable network probes spread over several countries in the context of the MONROE project. We collect a large dataset from more than 300 video streaming experiments. Our dataset consists of network traces, performance indicators captured during the streaming sessions, and experiment metadata. The dataset captures the wide variability in video streaming performance and reveals that mobile broadband still does not offer consistent quality guarantees across different countries and networks, especially for users on the move. We open-source our complete software toolset and provide the video dataset as open data.
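To suggest how such a dataset might be explored, here is a hedged pandas sketch that joins per-session metadata with throughput traces and compares performance across countries and operators; all column names and values are hypothetical, and the real MONROE schema should be consulted.

```python
import pandas as pd

# Hypothetical per-session metadata (the real dataset's fields will differ)
meta = pd.DataFrame({
    "session": [1, 2, 3, 4],
    "country": ["NO", "NO", "IT", "IT"],
    "operator": ["opA", "opB", "opA", "opB"],
    "startup_delay_s": [1.2, 3.4, 0.9, 2.8],
})
# Hypothetical network traces: one throughput sample per row
traces = pd.DataFrame({
    "session": [1, 1, 2, 2, 3, 4],
    "throughput_kbps": [4800, 5200, 900, 1100, 6100, 1500],
})

summary = (traces.groupby("session")["throughput_kbps"].mean()
                 .rename("mean_tput_kbps")
                 .to_frame()
                 .join(meta.set_index("session")))
# Average throughput and startup delay per (country, operator) pair
print(summary.groupby(["country", "operator"])
             [["mean_tput_kbps", "startup_delay_s"]].mean())
```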
Citations: 2
OpenCV.js: computer vision processing for the open web platform
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3208126
Sajjad Taheri, A. Veidenbaum, A. Nicolau, Ningxin Hu, M. Haghighat
The Web is the world's most ubiquitous compute platform and the foundation of the digital economy. Ever since its birth in the early 1990s, web capabilities have been increasing in both quantity and quality. However, in spite of all this progress, computer vision is not yet mainstream on the web. The reasons are historical and include the insufficient performance of JavaScript, the lack of camera support in the standard web APIs, and the lack of comprehensive computer-vision libraries. These problems are about to be solved, opening the way to an immersive and perceptual web with transformational effects in online shopping, education, and entertainment, among others. This work aims to bring computer vision to the web by porting hundreds of OpenCV functions to the open web platform. OpenCV is the most popular computer-vision library, with a comprehensive set of vision functions and a large developer community. OpenCV is implemented in C++ and, until now, was not available in web browsers without the help of unpopular native plugins. This work leverages OpenCV's efficiency, completeness, API maturity, and its community's collective knowledge. It is provided in a format that is easy for JavaScript engines to optimize highly and has an API that is easy for web programmers to adopt when developing applications. In addition, OpenCV parallel implementations that target SIMD units and multiprocessors can be ported to equivalent web primitives, providing better performance for real-time and interactive use cases.
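OpenCV.js exposes largely the same function names as the native bindings, so a pipeline prototyped with the Python bindings maps closely to browser code. A minimal sketch using three standard OpenCV calls follows (the random frame is a placeholder for camera input; the OpenCV.js equivalents are noted in comments).

```python
import cv2
import numpy as np

# Stand-in for a captured video frame
img = np.random.randint(0, 256, (240, 320, 3), dtype=np.uint8)

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # cv.cvtColor in OpenCV.js
blurred = cv2.GaussianBlur(gray, (5, 5), 0)   # cv.GaussianBlur
edges = cv2.Canny(blurred, 50, 150)           # cv.Canny
print(edges.shape, edges.dtype)               # (240, 320) uint8 edge map
```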
Citations: 25
Dynamic adaptive streaming for multi-viewpoint omnidirectional videos
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3204968
Xavier Corbillon, F. D. Simone, G. Simon, P. Frossard
Full immersion inside a Virtual Reality (VR) scene requires six Degrees of Freedom (6DoF) applications, where the user is allowed to perform translational and rotational movements within the virtual space. However, the implementation of 6DoF applications is still an open question. In this paper we study a multi-viewpoint (MVP) 360-degree video streaming system, in which a scene is simultaneously captured by multiple omnidirectional video cameras. The user can only switch positions to predefined viewpoints (VPs). We focus on the new challenges introduced by adaptive MVP 360-degree video streaming. We introduce several options for video encoding with existing technologies, such as High Efficiency Video Coding (HEVC), and for the implementation of VP switching. We model three video-segment download strategies for an adaptive streaming client as Mixed Integer Linear Programming (MILP) problems: an omniscient download scheduler; one where the client proactively downloads all VPs to guarantee fast VP switches; and one where the client reacts to the user's navigation pattern. We recorded an MVP 360-degree video with three VPs, implemented a mobile MVP 360-degree video player, and recorded the viewing patterns of multiple users navigating the content. We solved the adaptive streaming optimization problems on this video using the collected navigation traces. The results highlight the gains obtained by using tiles in terms of objective quality of the delivered content. They also emphasize the importance of further study of VP-switching prediction to reduce bandwidth consumption, and of measuring the impact of VP-switching delay on the subjective Quality of Experience (QoE).
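The paper formulates the three download strategies as MILPs; the toy Python heuristic below only conveys the flavor of the reactive strategy, picking the next (viewpoint, quality) download that maximizes expected quality under a per-segment budget. The probabilities, bitrate ladder, and the one-second-segment assumption (so a bitrate in kbps equals a segment size in kbit) are illustrative.

```python
def pick_next_segment(vp_probs, bitrates_kbps, budget_kbit):
    """vp_probs: dict viewpoint -> probability the user watches it next.
    bitrates_kbps: available quality levels (1 s segments assumed).
    budget_kbit: how much data fits before the playback deadline."""
    best, best_score = None, float("-inf")
    for vp, p in vp_probs.items():
        for rate in bitrates_kbps:
            if rate <= budget_kbit:            # must be downloadable in time
                score = p * rate               # crude expected-quality proxy
                if score > best_score:
                    best, best_score = (vp, rate), score
    return best

# Viewpoint probabilities as they might be estimated from navigation traces
probs = {"VP1": 0.7, "VP2": 0.2, "VP3": 0.1}
print(pick_next_segment(probs, [800, 2500, 6000], budget_kbit=3000))
# -> ('VP1', 2500): the highest expected quality that fits the budget
```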
Citations: 24
Multi-path multi-tier 360-degree video streaming in 5G networks
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3204978
Liyang Sun, Fanyi Duanmu, Yong Liu, Yao Wang, Y. Ye, Hang Shi, David H. Dai
360° video streaming is a key component of the emerging Virtual Reality (VR) and Augmented Reality (AR) applications. In 360° video streaming, a user may freely navigate through the captured 360° video scene by changing her desired Field-of-View. The high-throughput, low-delay data transfers enabled by 5G wireless networks can potentially facilitate an untethered 360° video streaming experience. Meanwhile, the high volatility of 5G wireless links presents unprecedented challenges for smooth 360° video streaming. In this paper, novel multi-path multi-tier 360° video streaming solutions are developed that simultaneously address the dynamics of both network bandwidth and user viewing direction. We systematically investigate various design trade-offs between streaming quality and robustness. Through simulations driven by real 5G network bandwidth traces and user viewing-direction traces, we demonstrate that the proposed 360° video streaming solutions can achieve a high level of Quality-of-Experience (QoE) in the challenging 5G wireless network environment.
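One plausible reading of the multi-path multi-tier trade-off, sketched in Python: protect the low-bitrate base tier on the steadier path and spend the more volatile, higher-capacity path on the enhancement tier for the predicted field of view. The split rule and all numbers are assumptions, not the paper's algorithm.

```python
def allocate(base_kbps, enh_kbps, path_caps_kbps):
    """path_caps_kbps: dict path -> current estimated capacity (kbps).
    Treats the lower-capacity path as the steadier one, e.g. sub-6 GHz
    vs. mmWave (an assumption for this sketch)."""
    steady, fast = sorted(path_caps_kbps, key=path_caps_kbps.get)
    plan = {}
    if path_caps_kbps[steady] >= base_kbps:
        plan["base"] = steady            # base tier protected on its own path
        enh_budget = path_caps_kbps[fast]
    else:                                # steady path too weak: fall back
        plan["base"] = fast
        enh_budget = path_caps_kbps[fast] - base_kbps
    plan["enhancement"] = fast if enh_budget >= enh_kbps else None
    return plan

print(allocate(base_kbps=2000, enh_kbps=8000,
               path_caps_kbps={"sub6GHz": 4000, "mmWave": 12000}))
# -> {'base': 'sub6GHz', 'enhancement': 'mmWave'}
```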
Citations: 70
A DASH video streaming system for immersive contents
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3208107
Giuseppe Ribezzo, Giuseppe Samela, Vittorio Palmisano, L. D. Cicco, S. Mascolo
Virtual Reality/Augmented Reality applications require streaming 360° videos to implement new services in a diverse set of fields such as entertainment, art, e-health, e-learning, and smart factories. Providing a high Quality of Experience when streaming 360° videos is particularly challenging due to the very high network bandwidth required. In this paper, we showcase a proof-of-concept implementation of a complete DASH-compliant delivery system for 360° videos that: 1) reduces the required bitrate, 2) is independent of the employed encoder, and 3) leverages technologies that are already available on the vast majority of mobile platforms and devices. The demo platform allows the user to directly experiment with various parameters, such as the duration of segments, the compression scheme, and the adaptive streaming algorithm parameters.
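Since the demo exposes the adaptive streaming algorithm's parameters for experimentation, here is a minimal buffer-based rate-selection rule of the kind such a player might use; the thresholds and bitrate ladder are placeholders, not the system's actual configuration.

```python
def select_bitrate(buffer_s, ladder_kbps, low=5.0, high=15.0):
    """Map the current buffer level to a rung of the bitrate ladder:
    lowest rung below `low` seconds, highest above `high`, linear in
    between (a classic buffer-based heuristic, assumed for this sketch)."""
    ladder = sorted(ladder_kbps)
    if buffer_s <= low:
        return ladder[0]
    if buffer_s >= high:
        return ladder[-1]
    frac = (buffer_s - low) / (high - low)
    return ladder[min(int(frac * len(ladder)), len(ladder) - 1)]

for b in (2, 8, 20):
    print(b, select_bitrate(b, [1000, 4000, 9000, 16000]))
# 2 -> 1000, 8 -> 4000, 20 -> 16000
```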
Citations: 5
Favor: fine-grained video rate adaptation
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3204957
Jian He, M. Qureshi, L. Qiu, Jin Li, Feng Li, Lei Han
Video rate adaptation has a large impact on quality of experience (QoE). However, existing video rate adaptation is rather limited by its small number of rate choices, which results in (i) under-selection, (ii) rate fluctuation, and (iii) frequent rebuffering. Moreover, selecting a single video rate for a 360° video can be even more limiting, since not all portions of a video frame are equally important. To address these limitations, we identify new dimensions along which to adapt user QoE: dropping video frames, slowing down the video play rate, and adapting different portions of 360° videos independently. These new dimensions, together with rate adaptation, give us more fine-grained adaptation and significantly improve user QoE. We further develop a simple yet effective learning strategy that automatically adapts the buffer reservation to avoid performance degradation beyond the optimization horizon. We implement our approach, Favor, in VLC, a well-known open-source media player, and demonstrate that Favor on average outperforms Model Predictive Control (MPC), rate-based, and buffer-based adaptation for regular videos by 24%, 36%, and 41%, respectively, and by 2X for 360° videos.
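A toy Python illustration of the extra adaptation dimensions named above: when even the lowest encoding rate exceeds the bandwidth estimate, trade frames or play rate instead of rebuffering. The thresholds and the 0.75x slowdown are illustrative assumptions, not Favor's tuned policy.

```python
def adapt(bw_kbps, ladder_kbps, fps=30):
    """Return a playback plan given a bandwidth estimate and bitrate ladder."""
    fitting = [r for r in ladder_kbps if r <= bw_kbps]
    if fitting:                            # normal case: plain rate adaptation
        return {"rate": max(fitting), "fps": fps, "speed": 1.0}
    lowest = min(ladder_kbps)
    if bw_kbps >= 0.75 * lowest:           # mild deficit: slow playback instead
        return {"rate": lowest, "fps": fps, "speed": bw_kbps / lowest}
    # severe deficit: also drop frames in proportion to the shortfall
    keep = max(bw_kbps / lowest, 0.5)      # keep at least half the frames
    return {"rate": lowest, "fps": int(fps * keep), "speed": 0.75}

for bw in (5000, 900, 400):
    print(bw, adapt(bw, [1000, 2500, 6000]))
# 5000 -> full-rate playback; 900 -> slowed playback; 400 -> slowed + dropped frames
```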
Citations: 8