André Klassen, Rüdiger Rolf, Lars Kiesow, Denis Meyer
Production and distribution of lecture-related media, especially lecture recordings, are becoming more and more important. To produce, distribute, and manage these media, several independent software packages are commonly used. There is no single place for lecturers to go to control these processes; instead, they have to get in contact with technical staff beforehand, and again whenever they want to change the distribution of the media or the media itself. This paper shows how to simplify these processes and give lecturers more control over their recordings by presenting an example integration of the lecture recording and distribution systems Lernfunk and Opencast Matterhorn into the learning management system Stud.IP.
"Integrating Production and Distribution of Lecture Related Media into an LMS," André Klassen, Rüdiger Rolf, Lars Kiesow, Denis Meyer. 2012 IEEE International Symposium on Multimedia, Dec. 10, 2012. doi:10.1109/ISM.2012.93
Parsing the structure of soccer video plays an important role in its semantic analysis. In this paper, we present a shot classification method based on the detection of grass-field pixels and the size of players. In addition, a replay detection algorithm is proposed. First, candidate logo images are identified using a contrast feature and histogram differences. Next, a contrast logo template is calculated to detect logo frames. Finally, replay segments are identified by pairing logo transitions and finding the beginning and end of each one. Experiments on three soccer matches showed that our method is effective and applicable to higher-level semantic analysis.
"Shot Type and Replay Detection for Soccer Video Parsing," Ngoc Nguyen, A. Yoshitaka. 2012 IEEE International Symposium on Multimedia, Dec. 10, 2012. doi:10.1109/ISM.2012.69
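The grass-field cue for shot classification described above can be sketched as a simple per-pixel test followed by a ratio threshold. The green-dominance rule and the threshold values below are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

# Hypothetical thresholds -- the paper's exact values are not given in the abstract.
LONG_SHOT_RATIO = 0.6    # mostly grass: wide view of the field
MEDIUM_SHOT_RATIO = 0.3  # some grass: medium view

def grass_ratio(frame_rgb):
    """Fraction of pixels whose green channel dominates red and blue."""
    r = frame_rgb[..., 0].astype(int)
    g = frame_rgb[..., 1].astype(int)
    b = frame_rgb[..., 2].astype(int)
    grass = (g > r) & (g > b) & (g > 50)
    return grass.mean()

def classify_shot(frame_rgb):
    """Coarse shot type from the grass-pixel ratio alone."""
    ratio = grass_ratio(frame_rgb)
    if ratio >= LONG_SHOT_RATIO:
        return "long"
    if ratio >= MEDIUM_SHOT_RATIO:
        return "medium"
    return "close-up"
```

A real system would refine the medium/close-up split with the player-size cue the abstract mentions; this sketch shows only the grass-ratio stage.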
Elenilson Vieira da Silva Filho, Glauco de Sousa e Silva, Hugo Neves de Oliveira, Anderson Vinicius Alves Ferreira, Erick Melo, T. Tavares, G. Motta, Guido Lemos de Souza Filho
The constant need to share data in information systems leads to ever more complex and creative solutions to the physical and cost limitations of current technology. The main problems of a distributed system include the large volume of information per time interval carried over the network infrastructure and the confidentiality of the data in transit. In the media-transmission subarea there are even more restrictions to consider: errors or delays, for example, can drastically impact the user experience in real-time transmission. In this context, this paper proposes a tool for efficient and secure distribution and encryption of video streams. The tool was implemented and applied in several contexts. To validate it in a range of possible situations, the video reflector was tested with several parameter sets, involving variations of video codecs and the presence or absence of cryptography.
"A Strategy of Multimedia Reflectors to Encryption and Codification in Real Time," Elenilson Vieira da Silva Filho, Glauco de Sousa e Silva, Hugo Neves de Oliveira, Anderson Vinicius Alves Ferreira, Erick Melo, T. Tavares, G. Motta, Guido Lemos de Souza Filho. 2012 IEEE International Symposium on Multimedia, Dec. 10, 2012. doi:10.1109/ISM.2012.59
Astronomical images are characterized by smooth features, a low signal-to-noise ratio (SNR), and extreme sensitivity to the motion of the platform. Due to the low SNR, it is necessary to collect a large number of frames and average them. However, unregistered frames in the sequence are a common occurrence. Frame registration using a feature-based approach fails due to low contrast, while area-based approaches such as template matching and phase correlation, although accurate, are computationally inefficient given the large size and number of frames in a sequence. This paper introduces a novel two-stage algorithm to accelerate registration. The first stage projects the direction of movement as a cluster of parallel streaks and determines the angle of motion using the linear Hough transform. The second stage applies normalized cross-correlation only along the estimated direction to find the exact displacement. Experimental results are tabulated to illustrate the superior computational efficiency of the proposed algorithm versus phase correlation, as well as its robustness in the presence of noise.
"Sequential Image Registration for Astronomical Images," S. Shahhosseini, B. Rezaie, V. Emamian. 2012 IEEE International Symposium on Multimedia, Dec. 10, 2012. doi:10.1109/ISM.2012.65
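The second stage of the two-stage scheme, a normalized cross-correlation search restricted to the motion direction estimated in stage one, might look like the sketch below. The circular-shift image model and the integer-only displacements are simplifying assumptions made here for illustration.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equal-size images."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom else 0.0

def register_along_direction(ref, mov, angle_deg, max_shift):
    """Stage two: with the motion angle already estimated (stage one,
    via the Hough transform on star streaks), search for the displacement
    only along that direction instead of over the full 2-D shift space."""
    ux, uy = np.cos(np.radians(angle_deg)), np.sin(np.radians(angle_deg))
    best_d, best_score = 0, -np.inf
    for d in range(-max_shift, max_shift + 1):
        dx, dy = int(round(d * ux)), int(round(d * uy))
        shifted = np.roll(np.roll(mov, dy, axis=0), dx, axis=1)
        score = ncc(ref, shifted)
        if score > best_score:
            best_d, best_score = d, score
    return best_d, best_score
```

The speedup comes from reducing a (2·max_shift+1)² two-dimensional search to a one-dimensional search of 2·max_shift+1 candidates.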
Online laboratories and virtual experiments are playing an increasingly important role in engineering and science education. While several repositories for online and virtual experiments are available, a common method for annotating experiments to simplify their discovery is not yet available and accepted. In 2010, an international group of online lab providers formed the Global Online Lab Consortium (GOLC) to address interoperability between online laboratories and laboratory compilations; one of its activities is the establishment of an ontology and a common metadata set that addresses not only the needs of typical lab providers and lab users, but also those of storage and archival institutions such as libraries. This article describes the current status of the GOLC metadata subcommittee's activities, lists the requirements of the metadata set's various user groups, and provides insight into both the underlying ontology and the metadata specifications themselves.
"A Standardized Metadata Set for Annotation of Virtual and Remote Laboratories," T. Richter, P. Grube, D. Zutin. 2012 IEEE International Symposium on Multimedia, Dec. 10, 2012. doi:10.1109/ISM.2012.92
In this modern era of entertainment, content-based multimedia services for general-purpose users are extending in various dimensions. Earlier, users were limited to local applications and some internet-based applications with restricted privileges. The use of Universal Plug and Play Audio Visual (UPnP-AV) standards [1], HTML5 support in browsers, WebRTC standards, and extended standards especially for mobile devices will bring more and more cloud multimedia services into play. However, these requirements implicitly demand more interaction and participation in commonly driven activities, giving a universal, unified experience. This paper proposes a solution to this challenge that uses context profiles on diverse devices to deliver adaptive services from a cloud server through a transcoding framework. The context profile governs the transcoding-framework algorithms used to produce adaptive or customized output from the cloud. A cloud architecture better suited to the multimedia service provider's point of view is also proposed.
"Context Profiling Based Multimedia Service on Cloud," A. Narula, Kaustubh R. Joshi. 2012 IEEE International Symposium on Multimedia, Dec. 10, 2012. doi:10.1109/ISM.2012.81
Kernel-based methods are widely applied to concept and event detection in video. Recently, kernels working on sequences of feature vectors of a video segment have been proposed for this problem, rather than treating feature vectors of individual frames independently. It has been shown that these sequence-based kernels (based, e.g., on the dynamic time warping or edit distance paradigms) outperform methods working on single frames for concepts with inherently dynamic features. Existing work on sequence-based kernels either uses a single type of feature or a fixed combination of the feature vectors of each frame. However, different features (e.g., visual and audio features) may be sampled at different (possibly even irregular) rates, and the optimal alignment between the sequences of features may be different. Multiple kernel learning (MKL) has been applied to similarly structured problems, and we propose MKL for combining different sequence-based kernels on different features for video concept detection. We demonstrate the advantage of the proposed method with experiments on the TRECVID 2011 Semantic Indexing data set.
"Learning Multiple Sequence-Based Kernels for Video Concept Detection," W. Bailer. 2012 IEEE International Symposium on Multimedia, Dec. 10, 2012. doi:10.1109/ISM.2012.22
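Full MKL learns the kernel weights jointly with the SVM objective. As a rough stand-in for that procedure, the sketch below weights precomputed sequence-kernel Gram matrices by kernel-target alignment and forms their convex combination; the alignment heuristic is an illustrative substitute chosen here, not the paper's learning method.

```python
import numpy as np

def alignment(K, y):
    """Kernel-target alignment <K, yy^T> / (||K||_F * ||yy^T||_F):
    how well a Gram matrix K agrees with the label structure y."""
    Y = np.outer(y, y)
    return (K * Y).sum() / (np.linalg.norm(K) * np.linalg.norm(Y))

def combine_kernels(kernels, y):
    """Weight each precomputed Gram matrix (e.g., one per sequence-based
    kernel on a different feature) by its non-negative alignment with the
    labels, then return the convex combination and the weights."""
    w = np.array([max(alignment(K, y), 0.0) for K in kernels])
    if w.sum() > 0:
        w = w / w.sum()
    else:
        w = np.full(len(kernels), 1.0 / len(kernels))
    combined = sum(wi * K for wi, K in zip(w, kernels))
    return combined, w
```

The combined Gram matrix can then be handed to any kernel classifier that accepts a precomputed kernel; kernels that agree more with the labels contribute more to the mixture.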
Covering the semantic gap in video indexing and retrieval has involved a continuous increase in the vocabulary of high-level features or semantic descriptors, sometimes organized in light-scale, corpus-specific computational ontologies. This paper presents a computer-supported manual annotation method that relies on very large-scale, shared, commonsense ontologies for the selection of semantic descriptors. The ontological terms are accessed through a linguistic interface that relies on multilingual dictionaries and action/event template structures (or frames). The manual generation or checking of annotations provides ground-truth data for evaluation purposes and training data for knowledge acquisition. The novelty of the approach lies in the use of widely shared large-scale ontologies, which prevent arbitrariness of annotation and favor interoperability. We test the viability of the approach through user studies on the annotation of narrative videos.
"Commonsense Knowledge for the Collection of Ground Truth Data on Semantic Descriptors," V. Lombardo, R. Damiano. 2012 IEEE International Symposium on Multimedia, Dec. 10, 2012. doi:10.1109/ISM.2012.23
Manual counting of viral plaques is a tedious and labor-intensive process. In this paper, an efficient and economical method is proposed for automating viral plaque counting via image segmentation and various morphological operations. The method first segments a plate image into individual well images. Then, it converts each well image into a binary image and creates a new image by merging the dilated binary image with the complement of the eroded binary image. Finally, the contour hierarchy of the merged image is obtained, and the plaque count is calculated by evaluating each outer contour and its inner contours. Experimental results showed that the counting accuracy of the proposed method is up to 90 percent and the average processing time for a single image is about one second. An open-source implementation with an optional graphical user interface is available for public use.
"Automated Viral Plaque Counting Using Image Segmentation and Morphological Analysis," Michael Moorman, Aijuan Dong. 2012 IEEE International Symposium on Multimedia, Dec. 10, 2012. doi:10.1109/ISM.2012.38
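The final counting step above walks a contour hierarchy. As a minimal stand-in, counting the 4-connected foreground components of the binary well image yields the outer-blob count; this flood-fill sketch is a deliberate simplification of the paper's contour-based step and ignores inner contours (touching or nested plaques).

```python
import numpy as np

def count_blobs(binary):
    """Count 4-connected foreground components in a boolean image.
    Each component corresponds to one outer contour in the paper's
    hierarchy-based counting (nested contours are not handled here)."""
    visited = np.zeros_like(binary, dtype=bool)
    h, w = binary.shape
    count = 0
    for i in range(h):
        for j in range(w):
            if binary[i, j] and not visited[i, j]:
                count += 1
                # Iterative flood fill marks the whole component as visited.
                stack = [(i, j)]
                visited[i, j] = True
                while stack:
                    y, x = stack.pop()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny, nx] and not visited[ny, nx]):
                            visited[ny, nx] = True
                            stack.append((ny, nx))
    return count
```

In the paper's scheme, inner contours let one outer blob contribute more than one plaque; that refinement is what the dilate/erode/merge preprocessing enables.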
Algorithmic image quality metrics have been based on the assumption that an image is distorted by only a single distortion type at a time, and their performance is low if an image contains more than one distortion concurrently. The aim of this study was to find efficient feature sets for predicting the visual quality of real photographs that are subject to many different distortion sources and types. The features should support each other and function with many concurrent image distortions. We used a correlation-based feature selection method and an image database created with various digital cameras. The results of the study are promising: our general and scene-specific feature combinations correlate well with human observations compared to state-of-the-art metrics.
"Features for Predicting Quality of Images Captured by Digital Cameras," M. Nuutinen, P. Oittinen, T. Virtanen. 2012 IEEE International Symposium on Multimedia, Dec. 10, 2012. doi:10.1109/ISM.2012.40
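Correlation-based feature selection typically scores a candidate subset with the CFS merit, merit = k·r_cf / sqrt(k + k(k−1)·r_ff), rewarding features correlated with the target while penalizing redundancy among features. The sketch below implements that standard heuristic; it is assumed as the selector family the abstract refers to, not quoted from the paper.

```python
import numpy as np

def cfs_merit(X, y, subset):
    """CFS merit of a feature subset of size k:
        k * r_cf / sqrt(k + k*(k-1)*r_ff)
    where r_cf is the mean |feature-target correlation| over the subset
    and r_ff is the mean |feature-feature correlation| within it."""
    k = len(subset)
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                    for i, a in enumerate(subset) for b in subset[i + 1:]])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)
```

A greedy forward search over `cfs_merit` then yields a feature set whose members "support each other": adding a feature only helps if its target correlation outweighs its redundancy with features already chosen.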