
Proceedings of the 9th ACM Multimedia Systems Conference: Latest Publications

Opensea
Pub Date: 2018-06-12 | DOI: 10.1145/3204949.3208128
Konstantin Pogorelov, Zeno Albisser, O. Ostroukhova, M. Lux, Dag Johansen, P. Halvorsen, M. Riegler
This paper presents an open-source tool for image and video frame classification. The classification takes a search-based approach and relies on global and local image features. It has been shown to work with images as well as videos and can classify video frames in real time, so the output can be used while the video is being recorded, played, or streamed. OpenSea has been shown to perform comparably to state-of-the-art methods such as deep learning while being much faster in terms of processing speed, and can therefore be seen as an easy-to-get and hard-to-beat baseline. We present a detailed description of the software, its installation, and its use. As a use case, we demonstrate the classification of polyps in colonoscopy videos based on a publicly available dataset. We conduct leave-one-out cross-validation to show the potential of the software in terms of classification time and accuracy.
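As a rough illustration of the search-based approach described above, the sketch below indexes global colour-histogram features for labelled training frames and classifies a new frame by a majority vote over its nearest neighbours. The feature choice, the distance metric, and k are illustrative assumptions, not the exact configuration used by OpenSea.

```python
import cv2
import numpy as np

def global_feature(image):
    """Global colour histogram (8x8x8 bins) as a simple stand-in for a global image feature."""
    hist = cv2.calcHist([image], [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()

class SearchBasedClassifier:
    def __init__(self, k=5):
        self.k = k
        self.features = []   # indexed features of labelled training frames
        self.labels = []     # e.g. "polyp" / "non-polyp"

    def index(self, image, label):
        self.features.append(global_feature(image))
        self.labels.append(label)

    def classify(self, frame):
        query = global_feature(frame)
        dists = np.linalg.norm(np.asarray(self.features) - query, axis=1)
        nearest = np.argsort(dists)[: self.k]
        votes = [self.labels[i] for i in nearest]
        return max(set(votes), key=votes.count)   # majority vote over nearest neighbours
```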
{"title":"Opensea","authors":"Konstantin Pogorelov, Zeno Albisser, O. Ostroukhova, M. Lux, Dag Johansen, P. Halvorsen, M. Riegler","doi":"10.1145/3204949.3208128","DOIUrl":"https://doi.org/10.1145/3204949.3208128","url":null,"abstract":"This paper presents an open-source classification tool for image and video frame classification. The classification takes a search-based approach and relies on global and local image features. It has been shown to work with images as well as videos, and is able to perform the classification of video frames in real-time so that the output can be used while the video is recorded, playing, or streamed. OpenSea has been proven to perform comparable to state-of-the-art methods such as deep learning, at the same time performing much faster in terms of processing speed, and can be therefore seen as an easy to get and hard to beat baseline. We present a detailed description of the software, its installation and use. As a use case, we demonstrate the classification of polyps in colonoscopy videos based on a publicly available dataset. We conduct leave-one-out-cross-validation to show the potential of the software in terms of classification time and accuracy.","PeriodicalId":141196,"journal":{"name":"Proceedings of the 9th ACM Multimedia Systems Conference","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127819253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Predicting the performance of virtual reality video streaming in mobile networks
Pub Date: 2018-06-12 | DOI: 10.1145/3204949.3204966
R. I. T. D. C. Filho, M. C. Luizelli, M. T. Vega, Jeroen van der Hooft, Stefano Petrangeli, T. Wauters, F. Turck, L. Gaspary
The demand for Virtual Reality (VR) video streaming to mobile devices is booming as VR becomes accessible to the general public. However, the variability of mobile network conditions affects the perception of this type of high-bandwidth-demanding service in unexpected ways. In this situation, there is a need for novel performance assessment models fit for the new VR applications. In this paper, we present PERCEIVE, a two-stage method for predicting the perceived quality of adaptive VR videos when streamed through mobile networks. By means of machine learning techniques, our approach first predicts adaptive VR video playout performance, using network Quality of Service (QoS) indicators as predictors. In a second stage, it employs the predicted VR video playout performance metrics to model and estimate end-user perceived quality. PERCEIVE has been evaluated in a real-world environment, in which VR videos are streamed under LTE/4G network conditions. Its accuracy has been assessed by means of the residual error between predicted and measured values. Our approach predicts the different performance metrics of the VR playout with an average prediction error lower than 3.7% and estimates the perceived quality with a prediction error lower than 4% for over 90% of all tested cases. Moreover, it allows us to pinpoint the QoS conditions that affect adaptive VR streaming services the most.
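The two-stage structure can be sketched as a pair of chained regressors: one mapping QoS indicators to playout metrics, and a second mapping playout metrics to perceived quality. The random-forest models, the feature layout, and the synthetic data below are assumptions made purely to show the data flow, not the paper's actual models.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Stage 1: QoS indicators (e.g. throughput, RTT, loss) -> playout metrics (e.g. stalls, bitrate)
stage1 = RandomForestRegressor(n_estimators=100, random_state=0)
# Stage 2: playout metrics -> perceived quality score
stage2 = RandomForestRegressor(n_estimators=100, random_state=0)

def fit(qos, playout, quality):
    stage1.fit(qos, playout)
    stage2.fit(playout, quality.ravel())

def predict_quality(qos):
    playout_pred = stage1.predict(qos)    # first stage: predicted playout performance
    return stage2.predict(playout_pred)   # second stage: estimated perceived quality

# Synthetic data, purely to show the shapes and the data flow.
rng = np.random.default_rng(0)
qos = rng.random((200, 3)); playout = rng.random((200, 2)); quality = rng.random((200, 1))
fit(qos, playout, quality)
print(predict_quality(qos[:5]))
```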
Citations: 38
Popsift
Pub Date: 2018-06-12 | DOI: 10.1145/3204949.3208136
C. Griwodz, L. Calvet, P. Halvorsen
The keypoint detector and descriptor Scale Invariant Feature Transform (SIFT) [8] is famous for its ability to extract and describe keypoints in 2D images of natural scenes. It is used in applications ranging from object recognition to 3D reconstruction. However, SIFT is considered compute-heavy. This has led to the development of many keypoint extraction and description methods that sacrifice the wide applicability of SIFT for higher speed. We present our CUDA implementation, named PopSift, which does not sacrifice any detail of the SIFT algorithm, achieves keypoint extraction and description that is as accurate as the best existing implementations, and runs at least 100x faster on a high-end consumer GPU than existing CPU implementations on a desktop CPU. Without any algorithmic trade-offs or shortcuts that sacrifice quality for speed, we extract at >25 fps from 1080p images upscaled to 3840x2160 pixels on a high-end consumer GPU.
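For context, the snippet below shows the CPU-side SIFT extraction with OpenCV that implementations such as PopSift accelerate on the GPU. It is only a reference baseline, not the authors' CUDA code, and it assumes OpenCV 4.4 or newer, where SIFT ships in the main module.

```python
import time
import cv2

def extract_sift(image_path):
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()                # SIFT is in the main OpenCV module from 4.4 onwards
    start = time.perf_counter()
    keypoints, descriptors = sift.detectAndCompute(gray, None)
    elapsed = time.perf_counter() - start
    print(f"{len(keypoints)} keypoints in {elapsed * 1000:.1f} ms")
    return keypoints, descriptors

# keypoints, descriptors = extract_sift("frame_1080p.png")   # hypothetical input frame
```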
{"title":"Popsift","authors":"C. Griwodz, L. Calvet, P. Halvorsen","doi":"10.1145/3204949.3208136","DOIUrl":"https://doi.org/10.1145/3204949.3208136","url":null,"abstract":"The keypoint detector and descriptor Scalable Invariant Feature Transform (SIFT) [8] is famous for its ability to extract and describe keypoints in 2D images of natural scenes. It is used in ranging from object recognition to 3D reconstruction. However, SIFT is considered compute-heavy. This has led to the development of many keypoint extraction and description methods that sacrifice the wide applicability of SIFT for higher speed. We present our CUDA implementation named PopSift that does not sacrifice any detail of the SIFT algorithm, achieves a keypoint extraction and description performance that is as accurate as the best existing implementations, and runs at least 100x faster on a high-end consumer GPU than existing CPU implementations on a desktop CPU. Without any algorithmic trade-offs and short-cuts that sacrifice quality for speed, we extract at >25 fps from 1080p images with upscaling to 3840x2160 pixels on a high-end consumer GPU.","PeriodicalId":141196,"journal":{"name":"Proceedings of the 9th ACM Multimedia Systems Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121991156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Enhancing the experience of multiplayer shooter games via advanced lag compensation
Pub Date: 2018-06-12 | DOI: 10.1145/3204949.3204971
Steven W. K. Lee, R. Chang
In multiplayer shooter games, lag compensation is used to mitigate the effects of network latency, or lag. Traditional lag compensation (TLC), however, introduces an inconsistency known as "shot behind covers" (SBC), especially for less lagged players. A few recent games ameliorate this problem by compensating only players whose lag is below a certain limit. This forces sufficiently lagged players to aim ahead of their targets, which is difficult and unrealistic. In this paper, we present a novel advanced lag compensation (ALC) algorithm. Based on TLC, this new algorithm retains the benefits of lag compensation without compromising less lagged players or compensating only certain players. To evaluate ALC, we invited players to play an FPS game we built from scratch and to answer questions after each match. Compared with TLC, ALC reduces the number of SBC by 94.1% and yields a significant drop in the number of SBC reported by players during matches (p < .05) and in the perceived SBC frequency collected at the end of each match (p < .05). ALC and TLC also share similar hit registration accuracy (p = .158 and p = .18) and responsiveness (p = .317).
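The abstract builds on traditional lag compensation, in which the server rewinds a target's recorded positions by the shooter's latency before testing a hit. The sketch below shows that baseline mechanism only; the data layout, the circular hit test, and the 2D positions are simplifying assumptions, and the paper's ALC refinements are not reproduced here.

```python
import bisect

class TargetHistory:
    """Server-side history of timestamped positions for one player."""
    def __init__(self):
        self.timestamps = []   # monotonically increasing server times (seconds)
        self.positions = []    # (x, y) positions recorded at those times

    def record(self, t, pos):
        self.timestamps.append(t)
        self.positions.append(pos)

    def position_at(self, t):
        """Most recent recorded position at or before time t."""
        i = bisect.bisect_right(self.timestamps, t) - 1
        return self.positions[max(i, 0)]

def validate_hit(history, shot_server_time, shooter_latency, aim_point, radius=0.5):
    """TLC: rewind the target to where the shooter saw it when firing, then test the hit."""
    rewound_time = shot_server_time - shooter_latency
    tx, ty = history.position_at(rewound_time)
    ax, ay = aim_point
    return (tx - ax) ** 2 + (ty - ay) ** 2 <= radius ** 2
```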
Citations: 13
Scalable distributed visual computing for line-rate video streams
Pub Date: 2018-06-12 | DOI: 10.1145/3204949.3204974
Chen Song, Jiacheng Chen, R. Shea, Andy Sun, Arrvindh Shriraman, Jiangchuan Liu
The past decade has witnessed significant breakthroughs in computer vision. Recent deep learning-based computer vision algorithms exhibit strong performance on recognition, detection, and segmentation. While the development of vision algorithms elicits promising applications, it also presents an immense computational challenge to the underlying hardware due to its complex nature, especially when attempting to process data at line rate. To this end, we develop a highly scalable computer vision processing framework that leverages technologies such as Spark Streaming and OpenCV to achieve line-rate video data processing. To ensure the greatest flexibility, our framework is agnostic to the computer vision model and can utilize environments with heterogeneous processing devices. To evaluate this framework, we deploy it in a production cloud computing environment and perform a thorough analysis of the system's performance. We use existing real-world live video streams from Simon Fraser University to measure the number of cars entering our university campus. Further, the data collected from our experiments is being used for real-time prediction of traffic conditions on campus.
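A single-node stand-in for the per-frame vision work that the paper distributes with Spark Streaming is sketched below: background subtraction and blob counting over a live stream with OpenCV. The stream URL, the area threshold, and the counting rule are placeholder assumptions; the distributed scheduling that is the paper's contribution is not shown.

```python
import cv2

def count_moving_objects(stream_url, min_area=1500):
    """Per-frame moving-object counts from one stream; the paper shards this work across a cluster."""
    cap = cv2.VideoCapture(stream_url)
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        mask = subtractor.apply(frame)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        moving = [c for c in contours if cv2.contourArea(c) >= min_area]
        yield len(moving)
    cap.release()

# Hypothetical campus stream URL; replace with a real source.
# for n in count_moving_objects("rtsp://example.edu/campus_gate"):
#     print("objects in frame:", n)
```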
Citations: 1
Tip-on-a-chip: automatic dotting with glitter ink pen for individual identification of tiny parts
Pub Date: 2018-06-12 | DOI: 10.1145/3204949.3208116
Yuta Kudo, Hugo Zwaan, Toru Takahashi, Rui Ishiyama, P. Jonker
This paper presents a new identification system for tiny parts that have no space for applying conventional ID marking or tagging. The system marks the parts with a single dot using ink containing shiny particles. The particles in a single dot naturally form a unique pattern. The parts are then identified by matching microscopic images of this pattern with a database containing images of these dots. In this paper, we develop an automated system to conduct dotting and image capturing for mass-produced parts. Experimental results show that our "Tip-on-a-chip" system can uniquely identify more than ten thousand chip capacitors.
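The identification step, matching a microscopic image of a glitter dot against a database of enrolled dot images, can be illustrated with generic local-feature matching. The ORB descriptors, brute-force Hamming matcher, and match-count score below are assumptions standing in for the paper's own matching pipeline.

```python
import cv2

orb = cv2.ORB_create(nfeatures=500)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def describe(image):
    _, descriptors = orb.detectAndCompute(image, None)
    return descriptors

def identify(query_img, database):
    """database: dict mapping part_id -> descriptors of the enrolled dot image."""
    query = describe(query_img)
    if query is None:
        return None, 0
    best_id, best_score = None, -1
    for part_id, enrolled in database.items():
        matches = matcher.match(query, enrolled)
        score = sum(1 for m in matches if m.distance < 40)   # count strong matches
        if score > best_score:
            best_id, best_score = part_id, score
    return best_id, best_score
```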
Citations: 3
AVtrack360
Pub Date: 2018-06-12 | DOI: 10.1145/3204949.3208134
S. Fremerey, Ashutosh Singla, Kay Meseberg, Alexander Raake
In this paper, we present a viewing test with 48 subjects watching 20 different entertaining omnidirectional videos on an HTC Vive Head Mounted Display (HMD) in a task-free scenario. While the subjects were watching the contents, we recorded their head movements. The resulting dataset is publicly available, together with the links and timestamps of the source contents used. Within this study, subjects were also asked to fill in the Simulator Sickness Questionnaire (SSQ) after every viewing session. In this paper, the SSQ results are presented first. Several methods for evaluating head rotation data are then presented and discussed. The collected dataset is published along with the scripts for evaluating the head rotation data. The paper presents the general angular ranges of the subjects' exploration behavior as well as an analysis of the areas where most of the viewing time was spent. The collected information can also be presented as head-saliency maps. For videos, head-saliency data can be used for training saliency models, as information for evaluating decisions during content creation, or as part of streaming solutions for region-of-interest-specific coding, as in the latest tile-based streaming solutions discussed in standardization bodies such as MPEG.
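One of the simplest evaluations of head-rotation data mentioned above, the angular range a subject explores, can be sketched as a yaw histogram. The log format (one yaw angle in degrees per sample) and the bin width are assumptions, not the dataset's actual schema.

```python
import numpy as np

def yaw_exploration_range(yaw_degrees, bin_width=10):
    """Fraction of the 360° yaw circle that a subject's head actually visited."""
    yaw = np.mod(np.asarray(yaw_degrees, dtype=float), 360.0)
    bins = np.arange(0, 360 + bin_width, bin_width)
    visited, _ = np.histogram(yaw, bins=bins)
    return np.count_nonzero(visited) / (len(bins) - 1)

# A subject who mostly looks straight ahead covers only a small part of the circle.
print(yaw_exploration_range([350, 355, 0, 5, 10, 12, 8]))   # ~0.08
```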
{"title":"AVtrack360","authors":"S. Fremerey, Ashutosh Singla, Kay Meseberg, Alexander Raake","doi":"10.1145/3204949.3208134","DOIUrl":"https://doi.org/10.1145/3204949.3208134","url":null,"abstract":"In this paper, we present a viewing test with 48 subjects watching 20 different entertaining omnidirectional videos on an HTC Vive Head Mounted Display (HMD) in a task-free scenario. While the subjects were watching the contents, we recorded their head movements. The obtained dataset is publicly available in addition to the links and timestamps of the source contents used. Within this study, subjects were also asked to fill in the Simulator Sickness Questionnaire (SSQ) after every viewing session. Within this paper, at first SSQ results are presented. Several methods for evaluating head rotation data are presented and discussed. In the course of the study, the collected dataset is published along with the scripts for evaluating the head rotation data. The paper presents the general angular ranges of the subjects' exploration behavior as well as an analysis of the areas where most of the time was spent. The collected information can be presented as head-saliency maps, too. In case of videos, head-saliency data can be used for training saliency models, as information for evaluating decisions during content creation, or as part of streaming solutions for region-of-interest-specific coding as with the latest tile-based streaming solutions, as discussed also in standardization bodies such as MPEG.","PeriodicalId":141196,"journal":{"name":"Proceedings of the 9th ACM Multimedia Systems Conference","volume":"33 1-2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116719166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A dataset of head and eye movements for 360° videos
Pub Date: 2018-06-12 | DOI: 10.1145/3204949.3208139
Erwan J. David, Jesús Gutiérrez, A. Coutrot, Matthieu Perreira Da Silva, P. Callet
Research on visual attention in 360° content is crucial to understand how people perceive and interact with this immersive type of content, to develop efficient techniques for processing, encoding, delivering and rendering, and to offer a high quality of experience to end users. The availability of public datasets is essential to support and facilitate the research activities of the community. Recently, some studies have analyzed the exploration behaviors of people watching 360° videos, and a few datasets have been published. However, the majority of these works only consider head movements as a proxy for gaze data, despite the importance of eye movements in the exploration of omnidirectional content. Thus, this paper presents a novel dataset of 360° videos with associated eye and head movement data, a follow-up to our previous dataset for still images [14]. Head and eye tracking data was obtained from 57 participants during a free-viewing experiment with 19 videos. In addition, guidelines on how to obtain saliency maps and scanpaths from the raw data are provided. Some statistics related to exploration behaviors are also presented; for example, the impact of the longitudinal starting position when watching omnidirectional videos was investigated in this test. This dataset and its associated code are made publicly available to support research on visual attention for 360° content.
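The guideline on deriving saliency maps from raw gaze data can be illustrated by accumulating fixation points on an equirectangular grid and smoothing with a Gaussian kernel. The frame resolution and the sigma below are assumptions; the dataset's published scripts may use different parameters.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def saliency_map(fixations, width=2048, height=1024, sigma=20):
    """fixations: iterable of (x, y) pixel coordinates on the equirectangular frame."""
    acc = np.zeros((height, width), dtype=np.float64)
    for x, y in fixations:
        acc[int(np.clip(y, 0, height - 1)), int(np.clip(x, 0, width - 1))] += 1.0
    sal = gaussian_filter(acc, sigma=sigma)   # smooth fixation counts into a continuous map
    return sal / sal.max() if sal.max() > 0 else sal
```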
Citations: 144
Want to play DASH?: a game theoretic approach for adaptive streaming over HTTP
Pub Date: 2018-06-12 | DOI: 10.1145/3204949.3204961
A. Bentaleb, A. Begen, S. Harous, Roger Zimmermann
In streaming media, it is imperative to deliver a good viewer experience to preserve customer loyalty. Prior research has shown that this is rather difficult when shared Internet resources struggle to meet the demand from streaming clients that are largely designed to behave in their own self-interest. To date, several schemes for adaptive streaming have been proposed to address this challenge with varying success. In this paper, we take a different approach and develop a game-theoretic solution. We present a practical implementation integrated into the dash.js reference player and provide substantial comparisons against state-of-the-art methods using trace-driven and real-world experiments. Our approach outperforms its competitors in average viewer experience by 38.5% and in video stability by 62%.
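The abstract does not spell out the game formulation, so the sketch below only shows the kind of self-interested, per-client throughput-based adaptation that such work competes with. The bitrate ladder and safety margins are assumptions, and this is not the paper's game-theoretic algorithm.

```python
BITRATE_LADDER_KBPS = [235, 750, 1750, 4300, 8000]     # hypothetical representation ladder

def select_bitrate(throughput_estimate_kbps, buffer_seconds, margin=0.8, min_buffer=5.0):
    """Greedy throughput-based selection, more conservative when the buffer runs low."""
    budget = throughput_estimate_kbps * margin
    if buffer_seconds < min_buffer:
        budget *= 0.5                                   # protect against stalls
    feasible = [r for r in BITRATE_LADDER_KBPS if r <= budget]
    return feasible[-1] if feasible else BITRATE_LADDER_KBPS[0]

print(select_bitrate(throughput_estimate_kbps=3000, buffer_seconds=12))   # -> 1750
```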
Citations: 30
Watermarked video delivery: traffic reduction and CDN management
Pub Date: 2018-06-12 | DOI: 10.1145/3204949.3204964
Kun He, P. Maillé, G. Simon
In order to track users who illegally re-stream live video streams, one solution is to embed identifying watermark sequences in the video segments to distinguish the users. However, since all variants of the watermarked segments must be prepared, existing solutions incur an extra bandwidth cost for delivery (at least doubling the required bandwidth). In this paper, we study how to reduce the internal delivery (traffic) cost of a Content Delivery Network (CDN). We propose a mechanism that reduces the number of watermarked segments that need to be encoded and delivered. We calculate the best- and worst-case traffic for two different cases: multicast and unicast. The results illustrate that even in the worst cases, the traffic with our approach is much lower than without our reduction mechanism. Moreover, the watermarked sequences still maintain uniqueness for each user. Experiments based on a real database are carried out and illustrate that our mechanism significantly reduces traffic with respect to current CDN practice.
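The baseline idea, giving each user a unique sequence of watermark variants across segments so that a leaked stream identifies the user, can be sketched with a simple A/B assignment driven by the bits of the user ID. This generic scheme is an assumption for illustration: it shows why naive delivery needs two variants of every segment, but it is not the paper's traffic-reduction mechanism.

```python
import math

def variant_sequence(user_id, num_segments):
    """Bit i of the user id decides which watermarked variant ('A' or 'B') of segment i is delivered."""
    return ["A" if (user_id >> i) & 1 == 0 else "B" for i in range(num_segments)]

def segments_needed(num_users):
    """n segments with two variants each distinguish 2**n users, so n = ceil(log2(users))."""
    return max(1, math.ceil(math.log2(num_users)))

print(segments_needed(10000))                          # 14 segments already separate 10,000 users
print(variant_sequence(user_id=5, num_segments=4))     # ['B', 'A', 'B', 'A']
```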
Citations: 1