For network operators (telcos), quality assessment of multimedia services has become a hot topic in recent years. We propose a model that simplifies the assessment and correlation of QoS and QoE metrics for video services. We report the development of a methodology that uses Full Reference (FR) and Reduced Reference (RR) metrics across network scenarios that apply QoS strategies (DiffServ). We used a multivariate correlation model and an ANN-based Pseudo-Subjective Quality Assessment (PSQA) model to accurately predict the user's subjective quality metric (MOS), taking the defined QoS into account. Simulation experiments show values that correlate well with each metric and allow validating QoE values adjusted to real user perception.
{"title":"Strategies for Improving the QoE Assessment over iTV Platforms Based on QoS Metrics","authors":"Diego J. Botia, Natalia Gaviria Gómez","doi":"10.1109/ISM.2012.98","DOIUrl":"https://doi.org/10.1109/ISM.2012.98","url":null,"abstract":"For network operators (telcos) the quality assessment over multimedia services has become a hot topic in recent years. We propose a model to simplify the assessment and correlation metrics QoS/QoE over video services. We report the development of a methodology that use metrics Full Reference (FR) and Reduced Reference (RR), through network scenarios using QoS strategies (Diffserv). We used a multivariate correlation model and a model based on ANN as PSQA (Pseudo subjective Quality Assessment) for accurately predicting subjective quality MOS metric of the user, taking into account the QoS defined. Simulation experiments show values that are correlated nicely with each metric and allow validating QoE values adjusted for real user perception.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115673097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Double JPEG image compression detection, or more specifically, double quantization detection, is an important digital image forensic method for detecting image forgery or tampering. In this paper, we introduce an improved double quantization detection method to improve the accuracy of JPEG image tampering detection. We evaluate our detection method using the publicly available CASIA authentic and tampered image data set of 9501 JPEG images. We carry out 20 rounds of experiments with stringent parameter settings placed on our detection method to demonstrate its robustness. In each round, the classifier is trained on a unique, non-overlapping, small subset consisting of 1/20 of the tampered and 1/72 of the authentic images, which gives a training data set of about 100 images per class; the remaining 19/20 of the tampered and 71/72 of the authentic images are used for testing. Through the experiments, we show an average improvement of 40.31% and 44.85% in the true negative (TN) rate and true positive (TP) rate, respectively, when compared with the current state-of-the-art method. The average TN and TP rates obtained from the 20 rounds of experiments carried out using our detection method are 90.81% and 76.95%, respectively. The experimental results show that our JPEG image forensics method can support reliable large-scale verification of digital image evidence authenticity with consistently good accuracy. The low training-to-testing data ratio also indicates that our method is robust in practical applications even when only a relatively small training data set is available.
{"title":"An Improved Double Compression Detection Method for JPEG Image Forensics","authors":"V. Thing, Yu Chen, C. Cheh","doi":"10.1109/ISM.2012.61","DOIUrl":"https://doi.org/10.1109/ISM.2012.61","url":null,"abstract":"Double JPEG image compression detection, or more specifically, double quantization detection, is an important digital image forensic method to detect the presence of image forgery or tampering. In this paper, we introduce an improved double quantization detection method to improve the accuracy of JPEG image tampering detection. We evaluate our detection method using the publicly available CASIA authentic and tampered image data set of 9501 JPEG images. We carry out 20 rounds of experiments with stringent parameter setting placed on our detection method to demonstrate its robustness. Each round of classifier is generated from a unique, non-overlapping and small subset composing of 1/20 of the tampered and 1/72 of the authentic images, to obtain a training data set of about 100 images per class, with the rest of the 19/20 of the tampered and 71/72 of the authentic images used for testing. Through the experiments, we show an average improvement of 40.31% and 44.85% in the true negative (TN) rate and true positive (TP) rate, respectively, when compared with the current state-of-the-art method. The average TN and TP rates obtained from 20 rounds of experiments carried out using our detection method, are 90.81% and 76.95%, respectively. The experimental results show that our JPEG image forensics method can support a reliable large-scale digital image evidence authenticity verification with consistent good accuracy. The low training to testing data ratio also indicates that our method is robust in practical applications even with a relatively limited or small training data set available.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115460163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
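The evaluation protocol in the double compression detection abstract (20 rounds, each trained on a disjoint 1/20 of the tampered and 1/72 of the authentic images, with the rest used for testing) can be sketched as follows. This is only an illustration of the splitting scheme and of how TP and TN rates are computed. The function names, the fixed seed, the class sizes in the usage note, and the convention that "positive" means "tampered" are all assumptions, not details taken from the paper.

```python
import random


def disjoint_rounds(items, n_rounds, denom, seed=0):
    """Yield (train, test) pairs for n_rounds rounds.

    Each round trains on a distinct slice holding len(items) // denom items;
    slices never overlap, and everything outside the slice is test data.
    Hypothetical helper: the paper does not describe its sampling code.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    size = len(shuffled) // denom
    if n_rounds * size > len(shuffled):
        raise ValueError("not enough items for non-overlapping rounds")
    for r in range(n_rounds):
        train = shuffled[r * size:(r + 1) * size]
        test = shuffled[:r * size] + shuffled[(r + 1) * size:]
        yield train, test


def tp_tn_rates(tp, fn, tn, fp):
    """Return (TP rate, TN rate), assuming 'positive' means 'tampered'."""
    return tp / (tp + fn), tn / (tn + fp)
```

With hypothetical class sizes of 2000 tampered and 7200 authentic images (round numbers chosen for illustration, not the actual CASIA class counts), `disjoint_rounds(range(2000), 20, 20)` and `disjoint_rounds(range(7200), 20, 72)` both give 100 training images per round, which matches the "about 100 images per class" figure in the abstract.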
Teeth segmentation is one of the important components in building an Automated Dental Identification System (ADIS). The extraction of the teeth from their corresponding dental radiographs is called teeth segmentation. Dental radiographs may suffer from poor teeth image quality, low contrast and uneven exposure that complicate the task of teeth segmentation. To achieve a good performance in segmentation, the teeth images are preprocessed by a two-step thresholding technique, which starts with an iterative thresholding followed by an adaptive thresholding to binarize the teeth images. Then, we propose to adapt the seam carving technique on the binary images, using both horizontal and vertical seams, to separate each individual tooth. The proposed method is evaluated experimentally and compared to other algorithms. The results show that our new approach achieves the lowest failure rate among all existing methods, and the highest optimality among all of the fully automated approaches reported in the literature.
{"title":"A New Approach to Teeth Segmentation","authors":"Nourdin Al-sherif, G. Guo, H. Ammar","doi":"10.1109/ISM.2012.35","DOIUrl":"https://doi.org/10.1109/ISM.2012.35","url":null,"abstract":"Teeth segmentation is one of the important components in building an Automated Dental Identification System (ADIS). The extraction of the teeth from their corresponding dental radiographs is called teeth segmentation. Dental radiographs may suffer from poor teeth image quality, low contrast and uneven exposure that complicate the task of teeth segmentation. To achieve a good performance in segmentation, the teeth images are preprocessed by a two-step thresholding technique, which starts with an iterative thresholding followed by an adaptive thresholding to binarize the teeth images. Then, we propose to adapt the seam carving technique on the binary images, using both horizontal and vertical seams, to separate each individual tooth. The proposed method is evaluated experimentally and compared to other algorithms. The results show that our new approach achieves the lowest failure rate among all existing methods, and the highest optimality among all of the fully automated approaches reported in the literature.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115639470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
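The first preprocessing step named in the teeth segmentation abstract, iterative thresholding, can be illustrated with the classic mean-of-class-means scheme below. This is a minimal sketch under the assumption that a standard global iterative threshold is meant; the abstract does not specify the exact variant, and the adaptive thresholding and seam carving stages are not shown. The function names and the `tol` parameter are illustrative.

```python
def iterative_threshold(pixels, tol=0.5):
    """Pick a global threshold by repeatedly averaging the two class means.

    Starts from the overall mean intensity and stops once the threshold
    moves by less than tol. Assumed variant, not the paper's own code.
    """
    t = sum(pixels) / len(pixels)
    while True:
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:
            return t  # all pixels fall on one side; nothing left to refine
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < tol:
            return new_t
        t = new_t


def binarize(pixels, threshold):
    """Map each pixel to 1 (brighter than threshold) or 0 (otherwise)."""
    return [1 if p > threshold else 0 for p in pixels]
```

For a flattened image with dark background values near 10 and bright values near 200, the threshold settles midway between the two class means, so bright structures are separated from the background before any per-region (adaptive) refinement.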
In this paper, we present an audio classification system that uses wavelets to extract low-level acoustic features. We perform multiple-level decomposition using the Discrete Wavelet Transform to extract acoustic features at different scales and times from audio recordings. The extracted features are then translated into a compact vector representation. Gaussian Mixture Models trained with the Expectation-Maximization algorithm are then used to build models for sound classes. Specifically, three types of audio classification tasks are designed to evaluate the system: speech/music classification, male/female speech classification, and music genre (classical, pop, jazz, and electronic) classification. Evaluated through 5-fold cross validation, the experimental results show the promising capability of wavelets for speech and music analyses.
{"title":"Using Wavelets and Gaussian Mixture Models for Audio Classification","authors":"C. Chuan, S. Vasana, A. Asaithambi","doi":"10.1109/ISM.2012.86","DOIUrl":"https://doi.org/10.1109/ISM.2012.86","url":null,"abstract":"In this paper, we present an audio classification system using wavelets for extracting low-level acoustic features. We perform multiple-level decomposition using Discrete Wavelet Transform to extract acoustic features at different scales and time from audio recordings. The extracted features are then translated into a compact vector representation. Gaussian Mixture Models with Expectation Maximization algorithm are then used to build models for sound classes. Specifically, three types of audio classification tasks are designed to evaluate the system, including speech/music classification, male/female speech classification, and music genre (classical, pop, jazz, and electronic) classification. By evaluating the system through 5-fold cross validation, the experimental result shows the promising capability of wavelets for speech and music analyses.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114772077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
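The multiple-level wavelet decomposition described in the audio classification abstract can be sketched with a Haar wavelet, turning each subband into one number to form a compact feature vector. Both the choice of the Haar wavelet and the choice of mean subband energy as the feature are assumptions made for illustration; the abstract names neither, and the Gaussian Mixture Model stage is not shown.

```python
import math


def haar_step(signal):
    """One Haar DWT level: return (approximation, detail) coefficients.

    Samples are processed in pairs; a trailing unpaired sample is dropped.
    """
    s = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def dwt_energy_features(signal, levels):
    """Return mean energy per subband: detail levels 1..levels, then the final approximation.

    Assumes len(signal) >= 2 ** levels so every level has at least one pair.
    """
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(sum(d * d for d in detail) / len(detail))
    features.append(sum(a * a for a in approx) / len(approx))
    return features
```

A constant signal puts all of its energy in the final approximation band, while a rapidly alternating signal puts it in the first detail band, which is the kind of scale separation a downstream classifier could exploit.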
In this paper, we propose a novel method for recommending photo-taking points from large social community photo collections. There is much research activity on photo-related recommendation based on the many photos stored and managed by photo sharing web services such as Flickr, Picasa and Panoramio. Although some methods, such as landmark recommendation, tag recommendation and photo recommendation, have already been proposed, no photo-taking point recommendation method has yet been realized for social photo collections. To realize photo-taking point recommendation, we introduce a novel point and photo selection method based on nested clustering. Our experiments show that the proposed method attains better recommendation accuracy.
{"title":"Photo-Taking Point Recommendation with Nested Clustering","authors":"Kosuke Kimura, Hung-Hsuan Huang, K. Kawagoe","doi":"10.1109/ISM.2012.20","DOIUrl":"https://doi.org/10.1109/ISM.2012.20","url":null,"abstract":"In this paper, we propose a novel recommendation method for photo-taking points from a large amount of social community photo collections. There are many research activities on photo-related recommendations from a lot of photos stored and managed by photo sharing web services, such as Flickr, Picas a and Panoramio, Although some methods, such as landmark recommendation, tag recommendation and photo recommendation have already been proposed, no photo-taking point recommendation methods have been realized yet for social photo collections. In order to realize photo-taking point recommendation, we introduce a novel point and photo selection method based on nested clustering. From our experiments, it is shown that better recommendation accuracy with our proposed method can be attained.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124978230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic segmentation of images with low depth of field (DOF) plays an important role in content-based multimedia applications. The proposed approach aims to separate the important objects (i.e., interest regions) of a given image from its defocused background in two stages. In stage one, image blocks are classified into object and background blocks using a novel cluster ensemble algorithm. Marking certain pixels (seeds) of the object and background blocks provides a hard constraint for the next stage of the approach. In stage two, a minimal graph cut is computed from the object and background seeds using the max-flow method. Experimental results for a wide range of busy-texture (i.e., noisy) and smooth regions demonstrate that the proposed approach provides better segmentation performance at higher speed compared with the state-of-the-art approaches.
{"title":"Automatic Segmentation of Interest Regions in Low Depth of Field Images Using Ensemble Clustering and Graph Cut Optimization Approaches","authors":"Gholamreza Rafiee, S. Dlay, W. L. Woo","doi":"10.1109/ISM.2012.39","DOIUrl":"https://doi.org/10.1109/ISM.2012.39","url":null,"abstract":"Automatic segmentation of images with low depth of field (DOF) plays an important role in content-based multimedia applications. The proposed approach aims to separate the important objects (i.e., interest regions) of a given image from its defocused background in two stages. In stage one, image blocks are classified into object and background blocks using a novel cluster ensemble algorithm. By indicating the certain pixels (seeds) of the object and background blocks, a hard constraint is provided for the next stage of the approach. In stage two, a minimal graph cut is constructed using object and background seeds, which is based on the max-flow method. Experimental results for a wide range of busy-texture (i.e., noisy) and smooth regions demonstrate that the proposed approach provides better segmentation performance at higher speed compared with the state-of-the-art approaches.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"492 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123158385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a spatial error concealment algorithm for recovering lost packets in still images or Intra-coded frames in video. The idea is based on the work proposed by Hsia [1], where 1-D boundary matching is applied to recover the lost areas while preserving the edge information. However, we modify Hsia's algorithm in several aspects to improve its performance in terms of subjective and objective quality, namely a smoothing operation using an averaging filter, adaptive expansion of the search range, and removal of crossing matching directions. Simulation results show the superiority of the proposed algorithm compared with Hsia's algorithm and some other state-of-the-art algorithms.
{"title":"Boundary Matching Based Spatial Interpolation for Consecutive Block Loss Concealment","authors":"Shaoshuai Gao","doi":"10.1109/ISM.2012.32","DOIUrl":"https://doi.org/10.1109/ISM.2012.32","url":null,"abstract":"This paper presents a spatial error concealment algorithm for recovering the lost packets in still images or Intra-coded frames in video. The idea is based on the work proposed by Hsia [1], where a 1-D boundary matching is applied to recover the lost areas while preserving the edge information. However, we modify Hsia's algorithm in several aspects to improve its performance in terms of subjective and objective quality, i.e., smoothing operation by an averaging filter, adaptive searching range expansion, and matching direction crossing removing. Simulation results shows superiority of the proposed algorithm compared with Hsia's algorithm and some other state-of-the-art algorithms.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126249413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
D. Schreiber, M. Mühlhäuser, Aristotelis Hadjakos, Erwin Aitenbichler
Modern collaboration applications have a multitude of QoS requirements. Depending on the application module, such as messaging, collaborative modeling, or multimedia conferencing, the networking requirements are different. Such applications can greatly benefit from middleware which allows for configuring and using multiple communication stacks from within the same application. This enables applications to employ the best middleware configuration for each communication task. In this paper we demonstrate how to optimize for reliability, low latency, and throughput using configurable protocol stacks. We have implemented a prototype and evaluated the positive effects of customizing the protocol stack in three different use case scenarios.
{"title":"Configurable Middleware for Multimedia Collaboration Applications","authors":"D. Schreiber, M. Mühlhäuser, Aristotelis Hadjakos, Erwin Aitenbichler","doi":"10.1109/ISM.2012.67","DOIUrl":"https://doi.org/10.1109/ISM.2012.67","url":null,"abstract":"Modern collaboration applications have a multitude of QoS requirements. Depending on the application module, such as messaging, collaborative modeling, or multimedia conferencing, the networking requirements are different. Such applications can greatly benefit from middleware which allows for configuring and using multiple communication stacks from within the same application. This enables applications to employ the best middleware configuration for each communication task. In this paper we demonstrate how to optimize for reliability, low latency, and throughput using configurable protocol stacks. We have implemented a prototype and evaluated the positive effects of customizing the protocol stack in three different use case scenarios.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125695672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
H. Reis, S. S. Borges, Vinicius H. S. Durelli, Luis Fernando de S. Moro, A. Brandão, E. Barbosa, L. O. Brandão, Seiji Isotani, P. Jaques, I. Bittencourt
The interface is the main mechanism of communication between user and system features. In educational software, successful user interface designs minimize the cognitive load on users, so that users can direct their efforts to maximizing their understanding of the educational concepts being presented. We investigated whether a reduced interface makes fewer cognitive demands on users than a complete interface. In this context, this research aims at analyzing a reduced and a complete interface of an interactive geometry software and verifying the educational benefits they provide. To this end, we designed the interfaces and carried out an experiment involving 69 undergraduate students. The experimental results indicate that an interface that hides advanced and extraneous features helps novice users to perform slightly better than novice users using a complete interface. After receiving proper training, however, a complete interface makes users more productive than a reduced interface.
{"title":"Towards Reducing Cognitive Load and Enhancing Usability through a Reduced Graphical User Interface for a Dynamic Geometry System: An Experimental Study","authors":"H. Reis, S. S. Borges, Vinicius H. S. Durelli, Luis Fernando de S. Moro, A. Brandão, E. Barbosa, L. O. Brandão, Seiji Isotani, P. Jaques, I. Bittencourt","doi":"10.1109/ISM.2012.91","DOIUrl":"https://doi.org/10.1109/ISM.2012.91","url":null,"abstract":"The interface is the main mechanism of communication between user and system features. In educational software, successful user interface designs minimize the cognitive load on users, thereby users can direct their efforts to maximize their understanding of the educational concepts being presented. We investigated whether a reduced interface make few cognitive demands on users in comparison to a complete interface. In this context, this research aims at analyzing a reduced and a complete interface of an interactive geometry software, and verify the educational benefits they provide. To this end, we designed the interfaces and carried out an experiment involving 69 undergraduate students. The experimental results indicate that an interface that hides advanced and extraneous features helps novice users to perform slightly better than novice users using a complete interface. After receiving proper training, however, a complete interface makes users more productive than a reduced interface.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128147114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper makes multidimensional QoE comparisons of GUIs for threshold selection in the QoE-based video output scheme SCS in audio-video IP transmission. SCS switches between error concealment and frame skipping by comparing the ratio of video slice loss in a video frame with a threshold value. For the purpose of adapting SCS to individual users' inclination, we need to create some appropriate threshold selection interfaces and give the option to select a threshold value to the user. We propose two new interfaces: a slide bar method (choosing a threshold value by a slide bar), and a two mode method (selection of one out of two modes by two buttons). In the latter, a threshold value is set according to the mode. To assess QoE multidimensionally, we conducted subjective experiments on four methods: the conventional error concealment method (100% method), a previously proposed interface called the radio button method (selecting a threshold value by four radio buttons) and the two new interfaces. As a result, we observe that the two new interfaces can achieve higher overall audiovisual QoE than the conventional 100% method and the radio button method. In addition, we show that the slide bar method achieves the highest controllability whereas it imposes a burden on the user and that the two mode method imposes the lightest burden on the user among the three interfaces while it provides high QoE.
{"title":"QoE Enhancement by GUI for Threshold Selection in the QoE-Based Video Output Scheme SCS","authors":"Tomohiro Yokoi, S. Tasaka, Toshiro Nunome","doi":"10.1109/ISM.2012.52","DOIUrl":"https://doi.org/10.1109/ISM.2012.52","url":null,"abstract":"This paper makes multidimensional QoE comparisons of GUIs for threshold selection in the QoE-based video output scheme SCS in audio-video IP transmission. SCS switches between error concealment and frame skipping by comparing the ratio of video slice loss in a video frame with a threshold value. For the purpose of adapting SCS to individual users' inclination, we need to create some appropriate threshold selection interfaces and give the option to select a threshold value to the user. We propose two new interfaces: a slide bar method (choosing a threshold value by a slide bar), and a two mode method (selection of one out of two modes by two buttons). In the latter, a threshold value is set according to the mode. To assess QoE multidimensionally, we conducted subjective experiments on four methods: the conventional error concealment method (100% method), a previously proposed interface called the radio button method (selecting a threshold value by four radio buttons) and the two new interfaces. As a result, we observe that the two new interfaces can achieve higher overall audiovisual QoE than the conventional 100% method and the radio button method. In addition, we show that the slide bar method achieves the highest controllability whereas it imposes a burden on the user and that the two mode method imposes the lightest burden on the user among the three interfaces while it provides high QoE.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132748851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
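The per-frame switching rule described in the SCS abstract (compare a frame's slice loss ratio with a user-selected threshold) can be sketched as below. The direction of the comparison is an assumption inferred from the abstract calling conventional error concealment the "100% method": with a threshold of 100% the loss ratio can never exceed it, so every damaged frame is concealed. The function name, the returned labels, and the handling of loss-free frames are illustrative and not taken from the paper.

```python
def scs_decision(lost_slices, total_slices, threshold_percent):
    """Choose how to output one video frame.

    Returns "display" for a loss-free frame, "skip" when the slice loss
    ratio exceeds the threshold, and "conceal" otherwise.
    Assumed rule: the strict '>' comparison is inferred, not confirmed.
    """
    if total_slices <= 0:
        raise ValueError("total_slices must be positive")
    if lost_slices == 0:
        return "display"
    loss_ratio = 100.0 * lost_slices / total_slices
    return "skip" if loss_ratio > threshold_percent else "conceal"
```

Under this reading, a slide bar interface would pass any value between 0 and 100 as `threshold_percent`, while a two mode interface would map each of its two buttons to one fixed value; the actual values used in the study are not given in the abstract.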