Paul Haimes, Tetsuaki Baba, Hiroya Suda, Kumiko Kushiyama
Fuji-chan is a simple ambient display device that uses a wireless internet connection to monitor two important characteristics of Mount Fuji, Japan's highest mountain. The system uses two internet-based data feeds to inform people of the weather conditions at the peak of the mountain and of the current level of volcanic eruption risk. We consider the latter information in particular to be of great importance. The two data feeds are communicated via LEDs placed at the top and base of the device, along with aural output to indicate volcanic eruption warning levels. We also created a simple web interface for this information. By creating this device and application, we aim to reimagine how geospatial information can be presented, while also creating something that is visually appealing. Through the demonstration of this multimodal system, we also aim to promote the idea of an "Internet of Beautiful Things", where IoT technology is applied to interactive artworks.
{"title":"Fuji-chan: A unique IoT ambient display for monitoring Mount Fuji's conditions","authors":"Paul Haimes, Tetsuaki Baba, Hiroya Suda, Kumiko Kushiyama","doi":"10.1145/3083187.3083223","DOIUrl":"https://doi.org/10.1145/3083187.3083223","url":null,"abstract":"Fuji-chan is a simple ambient display device, which uses a wireless internet connection to monitor two important characteristics of Mount Fuji, Japan's largest mountain. This system utilises two internet-based data feeds to inform people of the weather conditions at the peak of the mountain, along with the current level of volcanic eruption risk. We consider the latter information in particular to be of great importance. These two data feeds are communicated via LEDs placed at the top and base of the device, along with aural output to indicate volcanic eruption warning levels. We also created a simple web interface for this information. By creating this device and application, we aim to reimagine how geospatial information can be presented, while also creating something which is visually appealing. Through the demonstration of this multimodal system, we also aim to promote the idea of an \"Internet of Beautiful Things\", where IOT technology is applied to interactive artworks.","PeriodicalId":123321,"journal":{"name":"Proceedings of the 8th ACM on Multimedia Systems Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128741538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
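As a rough illustration of the feed-to-output mapping described in the Fuji-chan abstract, the Python sketch below turns a summit weather label and a volcanic alert level into LED colours and a sound flag. The assignment of weather to the top LED and eruption risk to the base LED, the colour tables, the weather labels, the 1 to 5 alert scale, the sound threshold, and all names are assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Tuple

RGB = Tuple[int, int, int]

# Hypothetical colour tables; the device's real palette is not given in the abstract.
WEATHER_COLOURS = {
    "clear": (255, 200, 0),
    "cloudy": (160, 160, 160),
    "rain": (0, 80, 255),
    "snow": (255, 255, 255),
}
# Assumed alert scale of 1 (lowest) to 5 (highest).
ALERT_COLOURS = {
    1: (0, 255, 0),
    2: (255, 255, 0),
    3: (255, 128, 0),
    4: (255, 0, 0),
    5: (160, 0, 255),
}


@dataclass(frozen=True)
class DisplayState:
    top_led: RGB      # assumed to show summit weather
    base_led: RGB     # assumed to show eruption risk
    play_sound: bool  # aural warning on or off


def display_state(weather: str, alert_level: int, sound_from_level: int = 2) -> DisplayState:
    """Map the two data feeds to the device outputs (illustrative only)."""
    if alert_level not in ALERT_COLOURS:
        raise ValueError(f"unknown alert level: {alert_level}")
    top = WEATHER_COLOURS.get(weather, (0, 0, 0))  # unknown weather: LED off
    return DisplayState(top, ALERT_COLOURS[alert_level], alert_level >= sound_from_level)
```

Keeping the mapping in a pure function like this would let the same logic drive both the physical device and the web interface mentioned in the abstract.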
Combining advanced sensors and powerful processing capabilities, smartphone-based augmented reality (AR) is becoming increasingly prevalent. The growing prominence of these resource-hungry AR applications poses significant challenges to energy-constrained environments such as mobile phones. To that end, we present a platform for offloading AR applications to powerful cloud servers. We implement this system using a thin-client design and explore its performance using the real-world application Pokemon Go as a case study. We show that, with careful design, a thin client is capable of offloading much of the AR processing to a cloud server, with the results being streamed back. Our initial experiments show substantial energy savings, low latency and excellent image quality even at relatively low bit-rates.
{"title":"Towards Fully Offloaded Cloud-based AR: Design, Implementation and Experience","authors":"R. Shea, Andy Sun, Silvery Fu, Jiangchuan Liu","doi":"10.1145/3083187.3084012","DOIUrl":"https://doi.org/10.1145/3083187.3084012","url":null,"abstract":"Combining advanced sensors and powerful processing capabilities smart-phone based augmented reality (AR) is becoming increasingly prolific. The increase in prominence of these resource hungry AR applications poses significant challenges to energy constrained environments such as mobile-phones. To that end we present a platform for offloading AR applications to powerful cloud servers. We implement this system using a thin-client design and explore its performance using the real world application Pokemon Go as a case study. We show that with careful design a thin client is capable of offloading much of the AR processing to a cloud server, with the results being streamed back. Our initial experiments show substantial energy savings, low latency and excellent image quality even at relatively low bit-rates.","PeriodicalId":123321,"journal":{"name":"Proceedings of the 8th ACM on Multimedia Systems Conference","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129612698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we present DroneFace, an open dataset for testing how well face recognition can work on drones. Because of their high mobility, drones, i.e. unmanned aerial vehicles (UAVs), are appropriate for surveillance, daily patrol or seeking lost people on the streets, and thus need the capability of tracking human targets' faces from the air. In this context, a drone's distance and height from its target influence the accuracy of face recognition. In order to test whether a face recognition technique is suitable for drones, we establish DroneFace, which is composed of facial images taken from various combinations of distances and heights, for evaluating how well a face recognition technique recognizes designated faces from the air. Face recognition is one of the most successful applications of image analysis and understanding, and many face recognition databases exist for various purposes. To the best of our knowledge, however, DroneFace is the only dataset that includes facial images taken from controlled distances and heights in an unconstrained environment, and it can be valuable for future studies on integrating face recognition techniques onto drones.
{"title":"DroneFace: An Open Dataset for Drone Research","authors":"Hwai-Jung Hsu, Kuan-Ta Chen","doi":"10.1145/3083187.3083214","DOIUrl":"https://doi.org/10.1145/3083187.3083214","url":null,"abstract":"In this paper, we present DroneFace, an open dataset for testing how well face recognition can work on drones. Because of the high mobility, drones, i.e. unmanned aerial vehicles (UAVs), are appropriate for surveillance, daily patrol or seeking lost people on the streets, and thus need the capability of tracking human targets' faces from the air. Under this context, drones' distances and heights from the targets influence the accuracy of face recognition. In order to test whether a face recognition technique is suitable for drones, we establish DroneFace composed of facial images taken from various combinations of distances and heights for evaluating how a face recognition technique works in recognizing designated faces from the air. Since Face recognition is one of the most successful application in image analysis and understanding, and there exist many face recognition database for various purposes. To the best of our knowledge, DroneFace is the only dataset including facial images taken from controlled distances and heights within unconstrained environment, and can be valuable for future study of integrating face recognition techniques onto drones.","PeriodicalId":123321,"journal":{"name":"Proceedings of the 8th ACM on Multimedia Systems Conference","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130112124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Users' QoE (Quality of Experience) in Multi-sensorial, Immersive, Collaborative Environments (MICE) applications is mostly measured by psychometric studies. These studies provide a subjective insight into the performance of such applications. In this paper, we hypothesize that the spatial coherence, or lack of it, of the embedded virtual objects among users correlates with the QoE in MICE. We use Position Discrepancy (PD) to model this lack of spatial coherence in MICE. Based on that, we propose a Hierarchical Position Discrepancy Model (HPDM) that computes PD at multiple levels to derive the application/system-level PD as a measure of performance. Experimental results on an example task in MICE show that HPDM can objectively quantify the application performance and correlates with the psychometric study-based QoE measurements. We envisage that HPDM can provide more insight into a MICE application without the need for an extensive user study.
{"title":"Modeling User Quality of Experience (QoE) through Position Discrepancy in Multi-Sensorial, Immersive, Collaborative Environments","authors":"Shanthi Vellingiri, Prabhakaran Balakrishnan","doi":"10.1145/3083187.3084018","DOIUrl":"https://doi.org/10.1145/3083187.3084018","url":null,"abstract":"Users' QoE (Quality of Experience) in Multi-sensorial, Immersive, Collaborative Environments (MICE) applications is mostly measured by psychometric studies. These studies provide a subjective insight into the performance of such applications. In this paper, we hypothesize that spatial coherence or the lack of it of the embedded virtual objects among users has a correlation to the QoE in MICE. We use Position Discrepancy (PD) to model this lack of spatial coherence in MICE. Based on that, we propose a Hierarchical Position Discrepancy Model (HPDM) that computes PD at multiple levels to derive the application/system-level PD as a measure of performance. Experimental results on an example task in MICE show that HPDM can objectively quantify the application performance and has a correlation to the psychometric study-based QoE measurements. We envisage HPDM can provide more insight on the MICE application without the need for extensive user study.","PeriodicalId":123321,"journal":{"name":"Proceedings of the 8th ACM on Multimedia Systems Conference","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127807694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
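The idea of computing Position Discrepancy at one level and aggregating it upward can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the abstract does not define PD or the aggregation rule, so the sketch assumes that object-level PD is the mean pairwise Euclidean distance between the positions at which different users' sites place the same virtual object, and that the application-level value is the mean over objects. The function names and the two-level hierarchy are hypothetical.

```python
from itertools import combinations
from math import dist
from statistics import mean


def object_pd(positions):
    """Object-level PD (assumed definition): mean pairwise Euclidean distance
    between the positions reported for one virtual object by each user."""
    pairs = list(combinations(positions, 2))
    return mean(dist(a, b) for a, b in pairs) if pairs else 0.0


def application_pd(scene):
    """Application-level PD (assumed aggregation): mean of the object-level
    values. `scene` maps an object name to its list of per-user positions."""
    return mean(object_pd(p) for p in scene.values()) if scene else 0.0
```

For instance, a scene in which two users see a cube 5 units apart but agree exactly on a ball would give an application-level PD of 2.5 under these assumptions.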
J. Chu, Chris Bryan, Min Shih, Leonardo Ferrer, K. Ma
Immersive, stereoscopic visualization enables scientists to better analyze structural and physical phenomena compared to traditional display mediums. Unfortunately, current head-mounted displays (HMDs) with the high rendering quality necessary for these complex datasets are prohibitively expensive, especially in educational settings where their high cost makes it impractical to buy several devices. To address this problem, we develop two tools: (1) An authoring tool allows domain scientists to generate a set of connected, 360° video paths for traversing between dimensional keyframes in the dataset. (2) A corresponding navigational interface is a video selection and playback tool that can be paired with a low-cost HMD to enable an interactive, non-linear, storytelling experience. We demonstrate the authoring tool's utility by conducting several case studies and assess the navigational interface with a usability study. Results show the potential of our approach in effectively expanding the accessibility of high-quality, immersive visualization to a wider audience using affordable HMDs.
{"title":"Navigable Videos for Presenting Scientific Data on Affordable Head-Mounted Displays","authors":"J. Chu, Chris Bryan, Min Shih, Leonardo Ferrer, K. Ma","doi":"10.1145/3083187.3084015","DOIUrl":"https://doi.org/10.1145/3083187.3084015","url":null,"abstract":"Immersive, stereoscopic visualization enables scientists to better analyze structural and physical phenomena compared to traditional display mediums. Unfortunately, current head-mounted displays (HMDs) with the high rendering quality necessary for these complex datasets are prohibitively expensive, especially in educational settings where their high cost makes it impractical to buy several devices. To address this problem, we develop two tools: (1) An authoring tool allows domain scientists to generate a set of connected, 360° video paths for traversing between dimensional keyframes in the dataset. (2) A corresponding navigational interface is a video selection and playback tool that can be paired with a low-cost HMD to enable an interactive, non-linear, storytelling experience. We demonstrate the authoring tool's utility by conducting several case studies and assess the navigational interface with a usability study. Results show the potential of our approach in effectively expanding the accessibility of high-quality, immersive visualization to a wider audience using affordable HMDs.","PeriodicalId":123321,"journal":{"name":"Proceedings of the 8th ACM on Multimedia Systems Conference","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125146979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
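The structure described in the navigable-videos abstract, keyframes connected by pre-rendered 360° video paths, can be pictured as a directed graph. The Python sketch below is a hypothetical data structure, not the authors' tool: keyframe names, clip file names, and the class and method names are all assumptions made for illustration.

```python
class VideoGraph:
    """Directed graph whose nodes are keyframes and whose edges are video clips."""

    def __init__(self):
        self._paths = {}  # source keyframe -> {destination keyframe: clip file}

    def add_path(self, src, dst, clip):
        """Register the clip that plays when moving from `src` to `dst`."""
        self._paths.setdefault(src, {})[dst] = clip

    def choices(self, keyframe):
        """Keyframes reachable from `keyframe`, i.e. the options a viewer could pick."""
        return sorted(self._paths.get(keyframe, {}))

    def playlist(self, route):
        """Clips to play for a route of keyframes; raises KeyError if a hop has no path."""
        return [self._paths[a][b] for a, b in zip(route, route[1:])]
```

A playback interface would call `choices` at each keyframe to present the available directions and then play the clip for the selected edge, which is one way to obtain the non-linear navigation the abstract mentions.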
Dimitris Chatzopoulos, Carlos Bermejo, Zhanpeng Huang, Arailym Butabayeva, Rui Zheng, Morteza Golkarifard, P. Hui
We develop Hyperion, a Wearable Augmented Reality (WAR) system based on Google Glass, to access text information in the ambient environment. Hyperion is able to retrieve text content from users' current view and deliver the content to them in different ways according to their context. We design four work modalities for different situations that mobile users encounter in their daily activities. In addition, user interaction interfaces are provided to adapt to different application scenarios. Although Google Glass is constrained by its poor computational capabilities and its limited battery capacity, we utilize code-level offloading to companion mobile devices to improve the runtime performance and the sustainability of WAR applications. System experiments show that Hyperion improves users' ability to be aware of text information around them. Our prototype indicates the promising potential of combining WAR technology with wearable devices such as Google Glass to improve people's daily activities.
{"title":"Hyperion: A Wearable Augmented Reality System for Text Extraction and Manipulation in the Air","authors":"Dimitris Chatzopoulos, Carlos Bermejo, Zhanpeng Huang, Arailym Butabayeva, Rui Zheng, Morteza Golkarifard, P. Hui","doi":"10.1145/3083187.3084017","DOIUrl":"https://doi.org/10.1145/3083187.3084017","url":null,"abstract":"We develop Hyperion a Wearable Augmented Reality (WAR) system based on Google Glass to access text information in the ambient environment. Hyperion is able to retrieve text content from users' current view and deliver the content to them in different ways according to their context. We design four work modalities for different situations that mobile users encounter in their daily activities. In addition, user interaction interfaces are provided to adapt to different application scenarios. Although Google Glass may be constrained by its poor computational capabilities and its limited battery capacity, we utilize code-level offloading to companion mobile devices to improve the runtime performance and the sustainability of WAR applications. System experiments show that Hyperion improves users ability to be aware of text information around them. Our prototype indicates promising potential of converging WAR technology and wearable devices such as Google Glass to improve people's daily activities.","PeriodicalId":123321,"journal":{"name":"Proceedings of the 8th ACM on Multimedia Systems Conference","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127227701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chinese herbal medicine (CHM) plays an important role in treatment in traditional Chinese medicine (TCM). Traditionally, CHM is used to restore the balance of the body for sick people and to maintain health for healthy people. However, a lack of knowledge about the herbs may lead to their misuse. In this demo, we present a real-time smartphone application that can not only recognize easily confused herbs using a Convolutional Neural Network (CNN), but also provide relevant information about the detected herbs. Our Chinese herb recognition system is implemented on a cloud server and can be used by the client user via smartphone. The recognition system is evaluated by 5-fold cross-validation and the accuracy is around 96%, which is adequate for real-world use.
{"title":"Recognition of Easily-confused TCM Herbs Using Deep Learning","authors":"Juei-Chun Weng, Min-Chun Hu, Kun-Chan Lan","doi":"10.1145/3083187.3083226","DOIUrl":"https://doi.org/10.1145/3083187.3083226","url":null,"abstract":"Chinese herbal medicine (CHM) plays an important role of treatment in traditional Chinese medicine (TCM). Traditionally, CHM is used to restore the balance of the body for sick people and maintain health for common people. However, lack of the knowledge of the herbs may cause misuse of the herbs. In this demo, we will present a real-time smartphone application, which can not only recognize easily-confused herb based on Convolutional Neural Network (CNN), but also provide relevant information about the detected herbs. Our Chinese herb recognition system is implemented on a cloud server and can be used by the client user via smartphone. The recognition system is evaluated by 5-fold cross validation method and the accuracy is around 96%, which is adequate for real-world use.","PeriodicalId":123321,"journal":{"name":"Proceedings of the 8th ACM on Multimedia Systems Conference","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130473539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
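The 5-fold cross-validation mentioned in the herb-recognition abstract can be sketched generically in Python. This is a plain illustration of the evaluation protocol, not the authors' pipeline: the shuffling seed, the function names, and the `evaluate` callback (which would train a model on the training indices and return its accuracy on the test indices) are assumptions.

```python
import random


def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.
    Every sample appears in exactly one test fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def cross_validate(n_samples, evaluate, k=5, seed=0):
    """Average the score returned by `evaluate(train, test)` over the k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_splits(n_samples, k, seed)]
    return sum(scores) / len(scores)
```

Reporting the mean over the five held-out folds, as this sketch does, is the usual way a single figure such as "around 96%" is obtained from cross-validation.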
3D Tele-Immersion (3DTI) systems allow geographically distributed users to interact in a virtual world using their "live" 3D models. The capture, reconstruction, transfer, and rendering of these models introduce significant latency into the system. Implicit Latency (ℒ') can be estimated using system clocks to measure the time from when the data is received from the RGB-D camera until the request to render the result. The Observed Latency (ℒ) between a real-world event and the event being rendered on the display cannot be accurately represented by ℒ', since ℒ' ignores the time taken to capture the data, to update the display, and so on. In this paper, a Visual Pattern based Latency Estimation (VPLE) approach is introduced to calculate the real-world visual latency of a system without the need for any custom hardware. VPLE generates a constantly changing pattern that is captured and rendered by the 3DTI system. An external observer records both the pattern and the rendered results at high frame rates. ℒ is estimated by calculating the difference between the generated and rendered patterns. VPLE is extended to allow ℒ estimation between geographically distributed sites. Evaluations show that the accuracy of VPLE depends on the refresh rate of the pattern, and is within 4 ms. The ℒ of a distributed 3DTI system implemented on the GPU is significantly lower than that of the CPU implementation, and is comparable to that of video streaming. It is also shown that the ℒ' estimates for GPU-based 3DTI implementations are off by almost 100% compared to ℒ.
{"title":"A Visual Latency Estimator for 3D Tele-Immersion","authors":"S. Raghuraman, K. Bahirat, B. Prabhakaran","doi":"10.1145/3083187.3084019","DOIUrl":"https://doi.org/10.1145/3083187.3084019","url":null,"abstract":"3D Tele-Immersion systems allow geographically distributed users to interact in a virtual world using their \"live\" 3D models. The capture, reconstruction, transfer, and rendering of these models introduce significant latency into the system. Implicit Latency (ℒ') can be estimated using system clocks to measure the time after the data was received from the RGB-D camera, till the request to render the result. The Observed Latency (ℒ) between a real world event and the event being rendered on the display, cannot be accurately represented by ℒ' since ℒ' ignores the time taken to capture, or update the display, etc. In this paper, a Visual Pattern based Latency Estimation (VPLE) approach is introduced to calculate the real world visual latency of a system without the need for any custom hardware. VPLE generates a constantly changing pattern that is captured and rendered by the 3DTI system. An external observer records both the pattern and the rendered results at high frame rates. ℒ is estimated by calculating the difference between the generated and rendered patterns. VPLE is extended to allow ℒ estimation between geographically distributed sites. Evaluations show that the accuracy of VPLE depends on the refresh rate of the pattern, and is within 4ms. ℒ of a distributed 3DTI system implemented on the GPU is significantly lower than the CPU implementation, and is comparable to video streaming. It is also shown that the ℒ' estimates for GPU based 3DTI implementations are off by almost 100% compared to the ℒ.","PeriodicalId":123321,"journal":{"name":"Proceedings of the 8th ACM on Multimedia Systems Conference","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127583152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
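The estimation step in the VPLE abstract, taking the difference between the generated and the rendered pattern as seen by an external observer, can be sketched in Python under a simplifying assumption: the pattern is treated as an integer counter that advances once per refresh period, so each observer frame yields a pair (generated value, rendered value). The counter encoding, the function name, and the averaging over frames are hypothetical; the abstract only states that the pattern changes constantly and that accuracy depends on its refresh rate.

```python
def estimate_latency_ms(observations, pattern_period_ms):
    """Estimate observed latency from external-camera frames.

    observations: list of (generated_value, rendered_value) pairs, one per
        recorded frame, where both values are counter states of the pattern.
    pattern_period_ms: time between pattern updates; this bounds the
        resolution of each individual measurement.
    """
    if not observations:
        raise ValueError("at least one observation is required")
    lags = []
    for generated, rendered in observations:
        if rendered > generated:
            raise ValueError("rendered pattern cannot be ahead of the generated one")
        lags.append((generated - rendered) * pattern_period_ms)
    return sum(lags) / len(lags)
```

In this model each measurement is quantized to one pattern period, which is consistent with the abstract's remark that accuracy is tied to the pattern's refresh rate.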
Konstantin Pogorelov, K. Randel, T. Lange, S. Eskeland, C. Griwodz, Dag Johansen, C. Spampinato, M. Taschwer, M. Lux, P. Schmidt, M. Riegler, P. Halvorsen
Bowel preparation (cleansing) is considered to be a key precondition for successful colonoscopy (endoscopic examination of the bowel). The degree of bowel cleansing directly affects the ability to detect diseases and may influence decisions on screening and follow-up examination intervals. An accurate assessment of bowel preparation quality is therefore important. Despite the use of reliable and validated bowel preparation scales, the grading may vary from one doctor to another. An objective and automated assessment of bowel cleansing would contribute to reducing such inequalities and optimizing the use of medical resources. This would also be a valuable feature for automatic endoscopy reporting in the future. In this paper, we present Nerthus, a dataset containing videos from inside the gastrointestinal (GI) tract, showing different degrees of bowel cleansing. By providing this dataset, we invite multimedia researchers to contribute to the medical field by building systems that automatically evaluate the quality of bowel cleansing for colonoscopy. Such innovations would probably contribute to improving the medical field of GI endoscopy.
{"title":"Nerthus: A Bowel Preparation Quality Video Dataset","authors":"Konstantin Pogorelov, K. Randel, T. Lange, S. Eskeland, C. Griwodz, Dag Johansen, C. Spampinato, M. Taschwer, M. Lux, P. Schmidt, M. Riegler, P. Halvorsen","doi":"10.1145/3083187.3083216","DOIUrl":"https://doi.org/10.1145/3083187.3083216","url":null,"abstract":"Bowel preparation (cleansing) is considered to be a key precondition for successful colonoscopy (endoscopic examination of the bowel). The degree of bowel cleansing directly affects the possibility to detect diseases and may influence decisions on screening and follow-up examination intervals. An accurate assessment of bowel preparation quality is therefore important. Despite the use of reliable and validated bowel preparation scales, the grading may vary from one doctor to another. An objective and automated assessment of bowel cleansing would contribute to reduce such inequalities and optimize use of medical resources. This would also be a valuable feature for automatic endoscopy reporting in the future. In this paper, we present Nerthus, a dataset containing videos from inside the gastrointestinal (GI) tract, showing different degrees of bowel cleansing. By providing this dataset, we invite multimedia researchers to contribute in the medical field by making systems automatically evaluate the quality of bowel cleansing for colonoscopy. Such innovations would probably contribute to improve the medical field of GI endoscopy.","PeriodicalId":123321,"journal":{"name":"Proceedings of the 8th ACM on Multimedia Systems Conference","volume":"1606 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129210469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
While Virtual Reality applications are increasingly attracting the attention of developers and business analysts, the behaviour of users watching 360-degree (i.e. omnidirectional) videos has not yet been thoroughly studied. This paper introduces a dataset of head movements of users watching 360-degree videos on a Head-Mounted Display (HMD). The dataset includes data collected from 59 users watching five 70 s-long 360-degree videos on the Razer OSVR HDK2 HMD. The selected videos span a wide range of 360-degree content for which different levels of viewer involvement, and thus different navigation patterns, could be expected. We describe the open-source software developed to produce the dataset and present the test material and viewing conditions considered during the data acquisition. Finally, we show some examples of statistics that can be extracted from the collected data for a content-dependent analysis of users' navigation patterns. The source code of the software used to collect the data has been made publicly available, together with the entire dataset, to enable the community to extend the dataset.
{"title":"360-Degree Video Head Movement Dataset","authors":"Xavier Corbillon, F. D. Simone, G. Simon","doi":"10.1145/3083187.3083215","DOIUrl":"https://doi.org/10.1145/3083187.3083215","url":null,"abstract":"While Virtual Reality applications are increasingly attracting the attention of developers and business analysts, the behaviour of users watching 360-degree (i.e. omnidirectional) videos has not been thoroughly studied yet. This paper introduces a dataset of head movements of users watching 360-degree videos on a Head-Mounted Display (HMD). The dataset includes data collected from 59 users watching five 70 s-long 360-degree videos on the Razer OSVR HDK2 HMD. The selected videos span a wide range of 360-degree content for which different viewer's involvement, thus navigation patterns, could be expected. We describe the open-source software developed to produce the dataset and present the test material and viewing conditions considered during the data acquisition. Finally, we show some examples of statistics that can be extracted from the collected data, for a content-dependent analysis of users' navigation patterns. The source code of the software used to collect the data has been made publicly available, together with the entire dataset, to enable the community to extend the dataset.","PeriodicalId":123321,"journal":{"name":"Proceedings of the 8th ACM on Multimedia Systems Conference","volume":"558 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116240188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
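One example of a statistic that could be extracted from head-movement traces of this kind is the angular speed of the head between consecutive samples. The Python sketch below assumes, purely for illustration, that each sample is a timestamp in seconds paired with a unit quaternion (w, x, y, z); the abstract does not describe the dataset's file format, so the input layout and function name are hypothetical.

```python
from math import acos, degrees


def angular_speeds(samples):
    """Return head angular speed in degrees per second for each pair of
    consecutive samples.

    samples: list of (t_seconds, (w, x, y, z)) with unit quaternions and
        strictly increasing timestamps (assumed layout).
    """
    speeds = []
    for (t0, q0), (t1, q1) in zip(samples, samples[1:]):
        # |q0 . q1| is cos(theta / 2) for the rotation theta between orientations;
        # the absolute value treats q and -q as the same orientation.
        dot = abs(sum(a * b for a, b in zip(q0, q1)))
        angle = 2.0 * acos(min(1.0, dot))  # clamp guards against rounding above 1
        speeds.append(degrees(angle) / (t1 - t0))
    return speeds
```

Aggregating these speeds per video would give one simple content-dependent view of how actively users navigate each clip.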