This paper introduces a new parallel algorithm (PA) for fast hand shape classification. This problem is challenging as a hand is characterized by a high number of degrees of freedom. Our objective is to design and implement a robust algorithm suitable for real-time applications. We show how parallelization can decrease the analysis time while increasing the classification accuracy. We also propose combining the shape contexts approach with appearance-based techniques to increase the efficacy of the PA. An extensive experimental study confirms the effectiveness of the proposed PA compared with other state-of-the-art methods.
{"title":"Parallel Hand Shape Classification","authors":"J. Nalepa, M. Kawulok","doi":"10.1109/ISM.2013.76","DOIUrl":"https://doi.org/10.1109/ISM.2013.76","url":null,"PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"1 1","pages":"401-402"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88596890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The advent of pen-based user interfaces has facilitated several natural ways for human-computer interaction. One example is sketch-based retrieval, i.e., the search for (multimedia) objects on the basis of sketches as query input. So far, work has focused mainly on sketch-based image retrieval. However, more and more application domains also benefit from sketches as query input for searching in video collections. Enabling spatial search in videos, in the form of sketch-based motion queries, is increasingly demanded by coaches and analysts in team sports as a novel and innovative tool for game analysis. Even though game analysis is already a major activity in this domain, it is still mostly based on manual selection of video sequences. In this paper, we present Sport Sense, a first approach to enabling intuitive and efficient video retrieval using sketch-based motion queries. This is accomplished by using videos of games in team sports, together with an overlay of meta data that incorporates spatio-temporal information about various events. Sport Sense exploits spatio-temporal databases to store, index, and retrieve the tracked information at interactive response times. Moreover, it provides first intuitive user input interfaces for sketches representing motion paths. A particular challenge is to convert the users' sketches into spatial queries and to execute these queries in a flexible way that allows for some controlled deviation between the sketched path and the actual movement of the players and/or the ball. The evaluation results of Sport Sense show that this approach to sketch-based retrieval in sports videos is both very effective and efficient.
{"title":"Towards Sketch-Based Motion Queries in Sports Videos","authors":"Ihab Al Kabary, H. Schuldt","doi":"10.1109/ISM.2013.60","DOIUrl":"https://doi.org/10.1109/ISM.2013.60","url":null,"PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"25 1","pages":"309-314"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73497888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we present a shot boundary decision fusion strategy that implements a multi-modal cascaded dichotomic search on the boundary space. The initial and core step of the proposed method is narrowing the shot boundary decision space as long as the accuracy is improved. Instead of the default sequential change detection, a dichotomic change strategy supervised by a cascaded fusion is implemented to achieve higher accuracy and lower algorithmic complexity. The main decision sources are image color histograms, object recognizer results, motion comparators, audio pattern analyzers, key point extractors, and edge descriptors, which are selectively employed in a cascaded manner. We propose a shot boundary detection algorithm that is noise tolerant, video genre adaptable, context aware, and computationally efficient. To reduce computational complexity, we construct a shot boundary search heuristic for pruning the set of candidate shot boundary frames. We employ both statistical and rule-based approaches in a cascaded fashion to decide how much of the search space to prune. The TRECVid 2006 and 2007 data sets are used in the evaluation, and performance results are given for both cuts and gradual transitions.
{"title":"Dichotomic Decision Cascading for Video Shot Boundary Detection","authors":"Mennan Güder, N. Cicekli","doi":"10.1109/ISM.2013.43","DOIUrl":"https://doi.org/10.1109/ISM.2013.43","url":null,"PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"8 1","pages":"227-230"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84446436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
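The dichotomic search idea in the shot boundary abstract can be illustrated with a minimal sketch. It is not the authors' multi-modal cascade: it assumes a single decision source (normalized color histograms per frame), an L1 histogram distance, and a fixed threshold, all of which are hypothetical choices made here for illustration. Instead of comparing every consecutive frame pair, it compares the two ends of a frame range and only bisects ranges whose ends differ.

```python
from typing import List, Sequence


def hist_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance between two normalized histograms (assumed feature)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def find_cuts(hists: List[Sequence[float]], lo: int, hi: int,
              threshold: float = 0.5) -> List[int]:
    """Return frame indices i in (lo, hi] where a cut lies between i-1 and i.

    Ranges whose end frames look alike are pruned without inspecting the
    frames in between; other ranges are split in half and searched again.
    """
    if hi <= lo:
        return []
    if hist_distance(hists[lo], hists[hi]) < threshold:
        return []  # ends agree: assume no boundary inside and prune
    if hi - lo == 1:
        return [hi]  # adjacent frames differ: boundary located
    mid = (lo + hi) // 2
    return (find_cuts(hists, lo, mid, threshold)
            + find_cuts(hists, mid, hi, threshold))


# Eight synthetic frames: four from one shot, four from another.
frames = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
cuts = find_cuts(frames, 0, len(frames) - 1)
print(cuts)
```

The pruning step is what reduces the number of comparisons, and it is also the weak point of this simplified version: a range that starts and ends in visually similar shots is pruned even if a different shot lies in between. Additional decision sources applied in a cascade, as the abstract describes, are one way such cases could be caught.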
Shashank Mujumdar, N. Rajamani, L. V. Subramaniam, Dror Porat
With the recent dramatic increase in the popularity of mobile electronic devices equipped with cameras, there is a growing number of real-world applications for image classification. Nevertheless, some of these real-world applications aim to classify images captured in an unconstrained manner and in complex environments where existing image classification techniques may not perform well. We propose an efficient image classification system that is robust enough to cope with challenging imaging conditions, and demonstrate its effectiveness in the context of classification of real-world images of dumpsters captured by mobile phones in the Indian metropolitan city of Hyderabad. Our system is able to achieve accurate classification of the cleanliness state of the dumpsters despite the challenging uncontrolled urban environment by utilizing a multi-stage approach, where the first stage is the efficient detection of the dumpster, and the second stage is the classification of its state. We analyze the performance of the system and provide comprehensive experimental results on a real-world public dataset.
{"title":"Efficient Multi-stage Image Classification for Mobile Sensing in Urban Environments","authors":"Shashank Mujumdar, N. Rajamani, L. V. Subramaniam, Dror Porat","doi":"10.1109/ISM.2013.45","DOIUrl":"https://doi.org/10.1109/ISM.2013.45","url":null,"PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"76 1","pages":"237-240"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79718817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Environmental sounds (ES) have distinctive characteristics, such as an unstructured nature and typically noise-like, flat spectra, which make the recognition task difficult compared to speech or music sounds. Here, we perform an exhaustive feature and classifier analysis for the recognition of considerably similar ES categories and propose a best representative feature to yield higher recognition accuracy. In the experiments, thirteen (13) ES categories, namely emergency alarm, car horn, gun, explosion, automobile, helicopter, water, wind, rain, applause, crowd, and laughter, are detected and tested based on eleven (11) audio features (MPEG-7 family, ZCR, MFCC, and combinations) using HMM and SVM classifiers. Extensive experiments have been conducted to demonstrate the effectiveness of these joint features for ES classification. Our experiments show that the joint feature set ASFCS-H (Audio Spectrum Flatness, Centroid, Spread, and Audio Harmonicity) is the best representative feature set, with an average F-measure of 80.6%.
{"title":"Audio Feature and Classifier Analysis for Efficient Recognition of Environmental Sounds","authors":"C. Okuyucu, M. Sert, A. Yazıcı","doi":"10.1109/ISM.2013.29","DOIUrl":"https://doi.org/10.1109/ISM.2013.29","url":null,"PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"50 1","pages":"125-132"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89814867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Florian Schimanke, R. Mertens, O. Vornberger, Stephanie Vollmer
Learning requires repetition. Spaced repetition algorithms aim to reduce the number of times a learner has to access a learning item by scheduling item presentation based on psychological models. These models take into account learner performance on previous interactions with the learning item and the rate at which humans forget what they have learned. In recent years, spaced repetition learning software has become popular for simple learning tasks like flash cards used for learning vocabulary. This paper presents a prototype application that extends the spaced repetition learning approach to more complex content like the kind usually found in learning games. One major difference between this content and flash cards is that learning games usually contain a number of different tasks that convey the same underlying concept categories. To complicate matters, one task might even be classified as belonging to a number of independent or orthogonal categories. This paper explores how these categories can be modeled on the basis of a mobile game designed for training in the field of relational databases. We have chosen a mobile approach to leverage its anytime/anyplace availability, which allows more precise scheduling by the spaced repetition algorithm.
{"title":"Multi Category Content Selection in Spaced Repetition Based Mobile Learning Games","authors":"Florian Schimanke, R. Mertens, O. Vornberger, Stephanie Vollmer","doi":"10.1109/ISM.2013.90","DOIUrl":"https://doi.org/10.1109/ISM.2013.90","url":null,"PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"62 3","pages":"468-473"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91466472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
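One way to picture spaced repetition over tasks that belong to several categories is the sketch below. It is an assumption-laden illustration, not the prototype described in the abstract: it uses a Leitner-style box per category, made-up review intervals in days, and a selection rule that prefers the task covering the most categories that are currently due. The names `Scheduler`, `CategoryState`, and `INTERVALS` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

# Review interval in days for each Leitner box (illustrative values only).
INTERVALS = [1, 2, 4, 8, 16]


@dataclass
class CategoryState:
    box: int = 0  # current Leitner box of the category
    due: int = 0  # day number on which the category is due again


@dataclass
class Scheduler:
    tasks: Dict[str, Set[str]]  # task name -> categories it trains
    state: Dict[str, CategoryState] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Every category starts in box 0 and is due immediately.
        for categories in self.tasks.values():
            for category in categories:
                self.state.setdefault(category, CategoryState())

    def record(self, task: str, correct: bool, today: int) -> None:
        """Update every category of the task after the learner answers."""
        for category in self.tasks[task]:
            s = self.state[category]
            s.box = min(s.box + 1, len(INTERVALS) - 1) if correct else 0
            s.due = today + INTERVALS[s.box]

    def next_task(self, today: int) -> Optional[str]:
        """Pick the task that covers the most due categories, if any."""
        def due_count(task: str) -> int:
            return sum(1 for c in self.tasks[task] if self.state[c].due <= today)

        best = max(sorted(self.tasks), key=due_count)
        return best if due_count(best) > 0 else None


sched = Scheduler({
    "q1": {"join", "select"},
    "q2": {"select"},
    "q3": {"normalization"},
})
first = sched.next_task(today=0)       # q1 covers two due categories
sched.record("q1", correct=True, today=0)
second = sched.next_task(today=0)      # only "normalization" is still due
print(first, second)
```

Tracking state per category rather than per task means that answering one task postpones every other task that only trains the same categories, which is the effect of tasks sharing underlying concepts that the abstract points to.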
Karina Fasolin, Renato Fileto, Marcelo Krüger, D. S. Kaster, Mônica Ribeiro Porto Ferreira, R. Cordeiro, A. Traina, C. Traina
This paper proposes an approach to efficiently execute conjunctive queries on big complex data together with their related conventional data. The basic idea is to horizontally fragment the database according to criteria frequently used in query predicates. The collection of fragments is indexed to efficiently find the fragment(s) whose contents satisfy some query predicate(s). The contents of each fragment are then indexed as well, to support efficient filtering of the fragment data according to other query predicate(s) conjunctively connected to the former. This strategy has been applied to a collection of more than 106 million images together with their related conventional data. Experimental results show considerable performance gain of the proposed approach for queries with conventional and similarity-based predicates, compared to the use of a unique metric index for the entire database contents.
{"title":"Efficient Execution of Conjunctive Complex Queries on Big Multimedia Databases","authors":"Karina Fasolin, Renato Fileto, Marcelo Krüger, D. S. Kaster, Mônica Ribeiro Porto Ferreira, R. Cordeiro, A. Traina, C. Traina","doi":"10.1109/ISM.2013.112","DOIUrl":"https://doi.org/10.1109/ISM.2013.112","url":null,"PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"1 1","pages":"536-543"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83390028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
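The two-level idea in the conjunctive query abstract (first find the fragment that satisfies a conventional predicate, then filter inside it with a similarity predicate) can be sketched as follows. This is a toy illustration under stated assumptions: a Python dictionary stands in for the fragment index, a linear scan with Euclidean distance stands in for the per-fragment metric index, and the class name `FragmentedStore` and the attribute values are hypothetical.

```python
from collections import defaultdict
from math import dist
from typing import Dict, Hashable, List, Sequence, Tuple


class FragmentedStore:
    """Items are horizontally fragmented by one conventional attribute."""

    def __init__(self) -> None:
        # fragment key -> list of (item id, feature vector)
        self.fragments: Dict[Hashable, List[Tuple[int, Tuple[float, ...]]]] = defaultdict(list)

    def add(self, key: Hashable, item_id: int, features: Sequence[float]) -> None:
        self.fragments[key].append((item_id, tuple(features)))

    def query(self, key: Hashable, center: Sequence[float], radius: float) -> List[int]:
        """Conjunctive query: attribute == key AND distance(features, center) <= radius.

        Only the matching fragment is scanned; a real system would replace
        the scan with a metric index built over that fragment.
        """
        fragment = self.fragments.get(key, [])
        return [item_id for item_id, f in fragment if dist(f, center) <= radius]


store = FragmentedStore()
store.add("landscape", 1, (0.0, 0.0))
store.add("landscape", 2, (5.0, 5.0))
store.add("portrait", 3, (0.0, 0.1))
hits = store.query("landscape", (0.0, 0.0), 1.0)
print(hits)
```

Item 3 is close to the query point but never examined, because its fragment does not satisfy the conventional predicate. That skipped work is the source of the gain the abstract reports relative to a single index over the whole collection.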
Benjamin Wulff, L. Rupp, Alexander Fecke, Kai-Christoph Hamborg
The LectureSight project aims to develop a cost-effective solution for automatic camera control for lecture recordings. An earlier work presented the prototypical implementation of the system. In this work, we present the results from testing the system in two experiments: LectureSight instances were deployed in two rooms, and the performance of the system was assessed in live lectures. Furthermore, an experimental study was conducted to investigate the usefulness of videos with presenter tracking for the learner. The experiment involved participants from two universities who were put into a simulated exam situation.
{"title":"The LectureSight System in Production Scenarios and Its Impact on Learning from Video Recorded Lectures","authors":"Benjamin Wulff, L. Rupp, Alexander Fecke, Kai-Christoph Hamborg","doi":"10.1109/ISM.2013.91","DOIUrl":"https://doi.org/10.1109/ISM.2013.91","url":null,"PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"13 1","pages":"474-479"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72819849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this work, a novel method is presented to improve the quality of panoramic images on a spherically arranged multi-sensor imaging system. The new method is composed of two parts. The first part maps the panorama generation problem onto a Markov Random Field (MRF) and then estimates posterior probabilities from initial likelihoods. Its novelty lies in extracting the prior evidence from the registration information of multiple cameras and estimating the expected value on an undirected graph. The second part is a geometrical approach targeting a better estimate of the initial priors, which has also not been applied before. The aim of both approaches is to decrease the parallax errors and ghosting effects that occur due to the nature of multi-camera systems. It is shown that, instead of directly using independent intensity coefficients extracted from registration information, applying a neighborhood-based local probability distribution to each pixel of the panorama, with the registration information as the prior, gives better results. Visual comparisons are provided to show the achieved quality enhancement in terms of a seamless and more natural panoramic image with fewer ghosting effects. Since the registration priors are used effectively with a single iteration step in a 4-connected neighborhood, the need for an intensity-based loopy and iterative inference method is eliminated. Hence, the proposed methods are suitable for real-time hardware implementation. A hardware implementation of the method for real-time operation is proposed.
{"title":"Spherical Panorama Construction Using Multi Sensor Registration Priors and Its Real-Time Hardware","authors":"Omer Cogal, Vladan Popovic, Y. Leblebici","doi":"10.1109/ISM.2013.35","DOIUrl":"https://doi.org/10.1109/ISM.2013.35","url":null,"PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"15 1","pages":"171-178"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87381873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper deals with Multi-View Video and Selectable Audio (MVV-SA) IP transmission, in which users can switch not only video but also audio according to a viewpoint change request. We evaluate the QoE of MVV-SA through a subjective experiment. The evaluation is performed by the Semantic Differential (SD) method with 13 adjective pairs. In the subjective experiment, we ask assessors to evaluate 40 stimuli, which consist of two kinds of UDP load traffic, two kinds of fixed additional delay, five kinds of playout buffering time, and selectable or un-selectable audio (i.e., MVV-SA or the previous MVV-A). The results show that MVV-SA gives the user a higher sense of presence than MVV-A and thereby enhances QoE. We also conduct factor analysis to clarify the component factors of QoE.
{"title":"Multidimensional QoE Assessment of Multi-view Video and Selectable Audio (MVV-SA) IP Transmission","authors":"Takuya Ishida, Toshiro Nunome","doi":"10.1109/ISM.2013.109","DOIUrl":"https://doi.org/10.1109/ISM.2013.109","url":null,"PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"141 1","pages":"534-535"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86250384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}