Localization of Acupoints using Augmented Reality
Yi-Zhang Chen, Corky Maigre, Min-Chun Hu, Kun-Chan Lan
DOI: 10.1145/3083187.3083225

An augmented reality (AR) system for acupuncture point localization is implemented on an Android smartphone. The user can use the system to locate the relevant acupuncture point for symptom relief (e.g., through acupressure).
Real Time Stable Haptic Rendering of 3D Deformable Streaming Surface
Yuan Tian, Chao Li, X. Guo, B. Prabhakaran
DOI: 10.1145/3083187.3083198

In recent years, much research has focused on haptic interaction with streaming data, such as RGB-D video and point-cloud streams captured by commodity depth sensors. Most previous methods use partial streaming data from depth sensors and only investigate haptic rendering of rigid surfaces without complex physics simulation. However, many virtual reality and tele-immersive applications, such as medical training and art design, require a complete scene and physics simulation. In this paper, we propose a stable haptic rendering method capable of interacting with a streaming deformable surface in real time. Our method applies KinectFusion to reconstruct the complete surface of a real-world object in real time, rather than working with an incomplete surface. During reconstruction, it simultaneously uses a hierarchical shape matching (HSM) method to simulate surface deformation under haptic-enabled interaction. We demonstrate how to combine fusion with physics-based simulation of deformation, and propose a continuous collision detection method based on the Truncated Signed Distance Function (TSDF). Furthermore, we propose a fast TSDF warping method to propagate deformations into the TSDF, and a proxy-finding method to locate the proxy position. The proposed method can simulate haptic-enabled deformation of the fused 3D surface, and therefore provides a novel haptic interaction for virtual reality and 3D tele-immersive applications. Experimental results show that the proposed approach provides stable haptic rendering and fast simulation of 3D deformable surfaces.
KVASIR: A Multi-Class Image Dataset for Computer Aided Gastrointestinal Disease Detection
Konstantin Pogorelov, K. Randel, C. Griwodz, S. Eskeland, T. Lange, Dag Johansen, C. Spampinato, Duc-Tien Dang-Nguyen, M. Lux, P. Schmidt, M. Riegler, P. Halvorsen
DOI: 10.1145/3083187.3083212

Automatic detection of diseases by computers is an important but still largely unexplored field of research. Such innovations may improve medical practice and refine health care systems all over the world. However, datasets containing medical images are hardly available, making reproducibility and comparison of approaches almost impossible. In this paper, we present KVASIR, a dataset containing images from inside the gastrointestinal (GI) tract. The collection is classified into three important anatomical landmarks and three clinically significant findings; in addition, it contains two categories of images related to endoscopic polyp removal. Sorting and annotation of the dataset were performed by medical doctors (experienced endoscopists). In this respect, KVASIR is important for research on both single- and multi-disease computer-aided detection. By providing it, we invite and enable multimedia researchers to enter the medical domain of detection and retrieval.
Multimedia Sensor Dataset for the Analysis of Vehicle Movement
Wonhee Cho, S. H. Kim
DOI: 10.1145/3083187.3083217

With applications ranging from basic trajectory calculations to complex autonomous vehicle operations, detailed vehicle movement analysis has been receiving more attention in academia and industry. So far, real-data-driven analysis, e.g., utilizing advanced machine learning, has used data from sensors such as GPS and accelerometers. However, such research requires quality datasets to enable accurate analysis. To that end, we have collected real vehicle movement data, Multimedia Sensor Data, containing fine-grained, synchronized sensor data such as GPS, accelerometer, digital compass, and gyroscope readings and, most importantly, matching real video recorded while driving. These video images provide a way to accurately label the sensor data when generating a quality dataset, e.g., a training dataset. We then performed preprocessing steps to clean and refine the raw data and converted the results into CSV files, which are compatible with a wide variety of analysis tools. We also provide sample cases demonstrating methods for identifying abnormal driving patterns such as moving over a speed bump. This dataset will be useful for researchers refining their analyses of vehicle movements.
A Measurement Study on Achieving Imperceptible Latency in Mobile Cloud Gaming
Teemu Kämäräinen, M. Siekkinen, Antti Ylä-Jääski, Wenxiao Zhang, P. Hui
DOI: 10.1145/3083187.3083191

Cloud gaming is a relatively new paradigm in which the game is rendered in the cloud and streamed to an end-user device through a thin client. Latency is a key challenge for cloud gaming. In order to optimize end-to-end latency, it is first necessary to understand how it builds up from the mobile device to the cloud gaming server. In this paper, we dissect the delays occurring in the mobile device and measure access delays in various networks and network conditions. We also perform a Europe-wide latency measurement study to find optimal server locations and to see how the number of server locations affects network delay. The results are compared to the limits for perceivable delays found in recent human-computer interaction studies. We show that these limits can be achieved only with the latest mobile devices and specific control methods. In addition, we study the latency reduction expected from near-future technological developments and show that their potential impact on end-to-end latency is larger than that of service replication and server placement optimization.
A Dataset for Exploring User Behaviors in VR Spherical Video Streaming
Chenglei Wu, Zhihao Tan, Zhi Wang, Shiqiang Yang
DOI: 10.1145/3083187.3083210

With Virtual Reality (VR) devices and content becoming increasingly popular, understanding user behavior in virtual environments is important for both VR product design and user experience improvement. In VR applications, head movement is one of the most important user behaviors; it can reflect a user's visual attention, preferences, and even unique motion patterns. However, to the best of our knowledge, no dataset containing this information is publicly available. In this paper, we present a head tracking dataset composed of 48 users (24 males and 24 females) watching 18 spherical videos from 5 categories. We carefully record how users watch the videos, how their heads move in each session, which directions they focus on, and what content they can remember after each session. Based on this dataset, we show that people share certain common patterns in VR spherical video streaming that differ from conventional video streaming. We believe the dataset can serve as a good resource for exploring user behavior patterns in VR applications.
A Holistic Multimedia System for Gastrointestinal Tract Disease Detection
Konstantin Pogorelov, S. Eskeland, T. Lange, C. Griwodz, K. Randel, H. Stensland, Duc-Tien Dang-Nguyen, C. Spampinato, Dag Johansen, M. Riegler, P. Halvorsen
DOI: 10.1145/3083187.3083189

Analysis of medical videos for detection of abnormalities and diseases requires not only high precision and recall, but also real-time processing for live feedback and scalability for massive screening of entire populations. Existing work in this field does not provide the necessary combination of retrieval accuracy and performance. In this paper, a multimedia system is presented that aims to tackle automatic analysis of videos from the human gastrointestinal (GI) tract. The system covers the whole pipeline, from data collection through processing and analysis to visualization, and combines filters using machine learning, image recognition, and extraction of global and local image features. Furthermore, it is built in a modular way so that it can easily be extended, and it is designed for efficient processing in order to provide real-time feedback to doctors. Our experimental evaluation shows that the system has detection and localisation accuracy at least as good as existing systems for polyp detection, can detect a wider range of diseases, can analyze video in real time, and has the low resource consumption needed for scalability.
Unified Remix: a Server Side Solution for Adaptive Bit-Rate Streaming with Inserted and Edited Media Content
Arjen Wagenaar, Dirk Griffioen, R. Mekuria
DOI: 10.1145/3083187.3083227

We present Unified Remix, our solution for adaptive bit-rate streaming of video presentations with inserted or edited content. The solution addresses three important challenges encountered when streaming personalized media presentations. First, it reduces vulnerability to ad-blocking technologies and the client-side playback deviations encountered when using manifest-manipulation-based methods. Second, it reduces the storage and computational costs of alternative server-side solutions, such as brute-force re-encoding or duplicate storage, to levels comparable to linear video streaming (VoD or live). Third, it handles the multi-source, multi-DRM, and multi-protocol aspects of modern video streaming natively in the workflow. The solution is based on a combination of existing, proven streaming technologies such as Unified Origin and newly designed components such as the Remix MPEG-4 module. The framework uses standardized technologies such as MPEG-4 ISOBMFF, SMIL, and MPEG-DASH. The components work together in a microservice architecture, enabling flexible deployment via a (container) orchestration framework on premises or in the cloud. The solution is demonstrated in two use cases: pre-, mid-, and post-roll content insertion, and live-archive-to-VoD conversion. As many use cases can be implemented on top of Unified Remix, we envision it as a key component of professional video streaming platforms.
A Scalable and Privacy-Aware IoT Service for Live Video Analytics
Junjue Wang, Brandon Amos, Anupam Das, P. Pillai, N. Sadeh, M. Satyanarayanan
DOI: 10.1145/3083187.3083192

We present OpenFace, our new open-source face recognition system that approaches state-of-the-art accuracy. Integrating OpenFace with inter-frame tracking, we build RTFace, a mechanism for denaturing video streams that selectively blurs faces according to specified policies at full frame rates. This enables privacy management for live video analytics while providing a secure approach for handling retrospective policy exceptions. Finally, we present a scalable, privacy-aware architecture for large camera networks using RTFace.
360° Video Viewing Dataset in Head-Mounted Virtual Reality
Wen-Chih Lo, Ching-Ling Fan, Jean Lee, Chun-Ying Huang, Kuan-Ta Chen, Cheng-Hsin Hsu
DOI: 10.1145/3083187.3083219

360° videos and Head-Mounted Displays (HMDs) are becoming increasingly popular. However, streaming 360° videos to HMDs is challenging, because only the video content in the viewer's Field-of-View (FoV) is rendered, and thus sending complete 360° videos wastes resources, including network bandwidth, storage space, and processing power. Optimizing 360° video streaming to HMDs is, however, highly data and viewer dependent, and thus requires real datasets. To the best of our knowledge, such datasets are not available in the literature. In this paper, we present our datasets of both content data (such as image saliency maps and motion maps derived from 360° videos) and sensor data (such as viewer head positions and orientations derived from HMD sensors). We put extra effort into aligning the content and sensor data using the timestamps in the raw log files. The resulting datasets can be used by researchers, engineers, and hobbyists to optimize existing 360° video streaming applications (such as rate-distortion optimization) and to build novel applications (such as crowd-driven camera movements). We believe that our dataset will stimulate more research activities along this exciting new research direction.