
Latest Publications: 2012 IEEE International Symposium on Multimedia

A Smart Kitchen Infrastructure
Pub Date : 2012-12-10 DOI: 10.1109/ISM.2012.27
Marcus Ständer, Aristotelis Hadjakos, Niklas Lochschmidt, Christian Klos, B. Renner, M. Mühlhäuser
In the future, our homes will be increasingly equipped with sensing and interaction devices that make new multimedia experiences possible. These experiences will not necessarily be bound to the TV, tabletop, smartphone, tablet, or desktop computer, but will be embedded in our everyday surroundings. To enable new forms of interaction, we equipped an ordinary kitchen with a large variety of sensors according to best practices. An innovation compared to related work is our Information Acquisition System, which allows kitchen appliances to be monitored and controlled remotely. This paper presents our sensing infrastructure and the novel kitchen interactions that the Information Acquisition System enables.
Citations: 17
AudioAlign - Synchronization of A/V-Streams Based on Audio Data
Pub Date : 2012-12-10 DOI: 10.1109/ISM.2012.79
Mario Guggenberger, M. Lux, L. Böszörményi
Manually synchronizing audio and video recordings is a tedious and time-consuming task, especially when the tracks are long or numerous. If the tracks are more than short clips (of a few seconds or minutes) and are recorded from heterogeneous sources, an additional problem comes into play: time drift, which arises when different recording devices are not synchronized. This demo paper presents the experimental software AudioAlign, which aims to simplify the manual synchronization process, with the ultimate goal of automating it altogether. It gives a short introduction to the topic, discusses the approach, method, implementation, and preliminary results, and outlines possible improvements.
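The paper does not disclose its exact matching method; a common baseline for estimating the time offset between two recordings of the same event is cross-correlation of the sample-domain signals. A minimal sketch under that assumption:

```python
import numpy as np

def estimate_offset(ref, other, sample_rate):
    # Full cross-correlation of the two tracks; the lag with the highest
    # correlation is the most plausible alignment offset (in seconds).
    corr = np.correlate(other, ref, mode="full")
    lag = int(np.argmax(corr)) - (len(ref) - 1)
    return lag / sample_rate

# Synthetic check: the same windowed tone, one copy starting 0.25 s later.
sr = 1000
tone = np.sin(2 * np.pi * 7 * np.arange(sr) / sr) * np.hanning(sr)
delayed = np.concatenate([np.zeros(sr // 4), tone])
print(estimate_offset(tone, delayed, sr))  # 0.25
```

Note that plain cross-correlation assumes no time drift; handling drifting clocks, as the paper describes, additionally requires resampling or piecewise alignment.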
Citations: 11
JIRL - A C++ Library for JPEG Compressed Domain Image Retrieval
Pub Date : 2012-12-10 DOI: 10.1109/ISM.2012.48
David Edmundson, G. Schaefer
In this paper we present JIRL, an open-source C++ software suite for content-based image retrieval in the JPEG compressed domain. We provide implementations of nine retrieval algorithms representing the current state of the art. For each algorithm, methods for compressed-domain feature extraction and feature comparison are provided in an object-oriented framework. In addition, our software suite includes functionality for benchmarking retrieval algorithms in terms of retrieval performance and retrieval time. An example full image-retrieval application is also provided to demonstrate how the library can be used. JIRL is made available to fellow researchers under the LGPL.
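Compressed-domain retrieval works on features readable from the JPEG stream without full decompression. As an illustrative sketch (not one of JIRL's nine algorithms), a histogram of per-block DC coefficients, which are just scaled 8x8 block means, makes a cheap descriptor:

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis, the transform JPEG applies to 8x8 blocks.
    k = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] /= np.sqrt(2.0)
    return C

def dc_histogram(img, bins=16):
    # The DC coefficient of each 8x8 block (8x the block mean) is available
    # after entropy decoding alone; its histogram is a simple
    # compressed-domain brightness descriptor.
    C = dct_matrix()
    h, w = img.shape
    dcs = [(C @ img[y:y + 8, x:x + 8] @ C.T)[0, 0]
           for y in range(0, h - 7, 8) for x in range(0, w - 7, 8)]
    hist, _ = np.histogram(dcs, bins=bins, range=(0.0, 8 * 255.0))
    return hist / hist.sum()

def l1_distance(f1, f2):
    return float(np.abs(f1 - f2).sum())
```

Here the DCT is computed from pixels for clarity; in a real compressed-domain pipeline the coefficients would be taken directly from the decoded JPEG blocks.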
Citations: 3
GPU Hierarchical Quilted Self Organizing Maps for Multimedia Understanding
Pub Date : 2012-12-10 DOI: 10.1109/ISM.2012.102
Y. Nashed
It is well established that the human brain outperforms current computers on pattern recognition tasks through the collaborative processing of simple building units (neurons). In this work we extend an abstracted model of the neocortex called the Hierarchical Quilted Self-Organizing Map, exploiting the parallel power of current Graphics Processing Units to achieve real-time understanding and classification of spatio-temporal sensory information. We also propose an improvement on the original model that allows the learning rate to adapt automatically to the available input training data. The overall system is tested on the task of gesture recognition using a publicly available Microsoft Kinect dataset.
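The building block of the Hierarchical Quilted SOM is the self-organizing map update: find the best matching unit, then pull it and its grid neighbours toward the input. A minimal single-layer sketch on the CPU (the paper's implementation is on the GPU; the learning rate and neighbourhood width here are illustrative):

```python
import numpy as np

def som_step(weights, x, lr=0.3, sigma=1.0):
    # One online update of a 1-D self-organizing map: find the best
    # matching unit (BMU), then pull it and its grid neighbours toward
    # the input x, weighted by a Gaussian neighbourhood kernel.
    bmu = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
    grid = np.arange(len(weights))
    h = np.exp(-((grid - bmu) ** 2) / (2.0 * sigma ** 2))
    weights += lr * h[:, None] * (x - weights)
    return bmu

rng = np.random.default_rng(0)
weights = rng.random((8, 2))               # 8 map units, 2-D inputs
for x in rng.normal(0.5, 0.05, (500, 2)):  # tight cluster around (0.5, 0.5)
    som_step(weights, x)
```

HQSOM stacks such maps so that lower layers learn spatial patterns and higher layers learn their temporal sequences; the per-unit update above is what the GPU parallelizes.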
Citations: 1
ARtifact: Tablet-Based Augmented Reality for Interactive Analysis of Cultural Artifacts
Pub Date : 2012-12-10 DOI: 10.1109/ISM.2012.17
D. Vanoni, M. Seracini, F. Kuester
To ensure the preservation of cultural heritage, artifacts such as paintings must be analyzed to diagnose physical frailties that could result in permanent damage. Advancements in digital imaging techniques and computer-aided analysis have greatly aided such diagnoses but can limit the ability to work directly with the artifact in the field. This paper presents the implementation and application of ARtifact, a tablet-based augmented reality system that enables on-site visual analysis of the artifact in question. Utilizing real-time tracking of the artifact under observation, a user interacting with the tablet can study various layers of data registered with the physical object in situ. These layers, representing data acquired through imaging modalities such as infrared thermography and ultraviolet fluorescence, provide the user with an augmented view of the artifact to aid on-site diagnosis and restoration. Intuitive interaction techniques further enable targeted analysis of artifact-related data. We present a case study utilizing our tablet system to analyze a 16th-century Italian hall and highlight the benefits of our approach.
Citations: 21
Face Recognition Using Discrete Tchebichef-Krawtchouk Transform
Pub Date : 2012-12-10 DOI: 10.1109/ISM.2012.31
Wissam A. Jassim, Paramesran Raveendran
In this paper, a face recognition system based on the Discrete Tchebichef-Krawtchouk Transform (DTKT) and Support Vector Machines (SVMs) is proposed. The objective of this paper is to present: (1) the mathematical and theoretical framework for the definition of the DTKT, including the transform equations that need to be addressed; (2) the DTKT features used in the classification of faces; and (3) results of empirical tests that compare the representational capabilities of this transform with other discrete transforms such as the Discrete Tchebichef Transform (DTT), the Discrete Krawtchouk Transform (DKT), and the Discrete Cosine Transform (DCT). The system is tested on a large number of faces collected from the ORL and Yale face databases. Empirical results show that the proposed transform gives very good overall accuracy under both clean and noisy conditions.
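These transforms are built from discrete orthogonal polynomials. One numerically convenient way to obtain such a basis is QR decomposition of a Vandermonde matrix; on a uniform grid this yields, up to sign, the normalised discrete Tchebichef polynomials. The sketch below shows only the Tchebichef side, not the paper's mixed Tchebichef-Krawtchouk construction, and `moment_features` is an illustrative feature extractor, not the authors' exact pipeline:

```python
import numpy as np

def discrete_orthonormal_polys(N, order):
    # Gram-Schmidt (via QR) on the Vandermonde matrix over {0,...,N-1}
    # gives discrete orthonormal polynomials; up to sign these are the
    # normalised discrete Tchebichef polynomials underlying the DTT.
    x = np.arange(N, dtype=float)
    V = np.vander(x, order, increasing=True)
    Q, _ = np.linalg.qr(V)
    return Q  # shape (N, order); Q.T @ Q is the identity

def moment_features(img, order=8):
    # Low-order 2-D orthogonal moments T = P_r^T A P_c, flattened into a
    # feature vector that a classifier such as an SVM could consume.
    P_r = discrete_orthonormal_polys(img.shape[0], order)
    P_c = discrete_orthonormal_polys(img.shape[1], order)
    return (P_r.T @ img @ P_c).ravel()
```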
Citations: 16
Energy Consumption Reduction via Context-Aware Mobile Video Pre-fetching
Pub Date : 2012-12-10 DOI: 10.1109/ISM.2012.56
A. Devlic, P. Lungaro, P. Kamaraju, Z. Segall, Konrad Tollmar
The arrival of smartphones and tablets, along with flat-rate mobile Internet pricing, has driven increasing adoption of mobile data services. According to recent studies, video has been the main driver of mobile data consumption, with a higher growth rate than any other mobile application. However, streaming medium/high-quality video files can be an issue in a mobile environment where the available capacity must be shared among a large number of users. Additionally, the energy consumption of mobile devices increases proportionally with the duration of data transfers, which depends on the download data rates achievable by the device. In this respect, opportunistic content pre-fetching schemes, which exploit times and locations with high data rates to deliver content before a user requests it, have the potential to reduce the energy consumed by content delivery and to improve the user's quality of experience by playing back pre-stored content with virtually no perceived interruptions or delays. This paper presents a family of opportunistic content pre-fetching schemes and compares their performance to standard on-demand access to content. Adopting a simulation approach on experimental data collected with monitoring software installed in mobile terminals, we show that content pre-fetching can reduce the energy consumption of mobile devices by up to 30% compared to on-demand download of the same file, given a time window of 1 hour to complete the content prepositioning.
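The saving rests on a simple relationship: radio energy scales with transfer time, and transfer time is file size divided by data rate, so fetching at an opportunistically chosen fast moment shortens the radio-on period. A back-of-envelope model with purely illustrative numbers (not the paper's measurements):

```python
# Illustrative radio-energy model: energy = active power x transfer time,
# and transfer time = file size / achievable data rate. All numbers below
# are hypothetical, not measurements from the paper.
def transfer_energy_joules(size_mbytes, rate_mbit_s, radio_power_w=1.5):
    seconds = size_mbytes * 8.0 / rate_mbit_s
    return radio_power_w * seconds

on_demand = transfer_energy_joules(50, 4)    # stream now at a mediocre rate
prefetched = transfer_energy_joules(50, 40)  # fetched earlier at a fast spot
saving = 1.0 - prefetched / on_demand
print(f"{saving:.0%}")  # 90%
```

Real savings are smaller (the paper reports up to 30%) because radio power also rises with rate and some prefetched content is never watched.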
Citations: 13
Detection and Identification of Chimpanzee Faces in the Wild
Pub Date : 2012-12-10 DOI: 10.1109/ISM.2012.30
A. Loos, Andreas Ernst
In this paper, we present and evaluate a unified automatic image-based face detection and identification framework using two datasets of captive and free-living chimpanzee individuals gathered in uncontrolled environments. This application scenario poses several challenging problems, such as varying lighting, diverse expressions, partial occlusion, and non-cooperative subjects. After the faces and facial feature points are detected, we use a projective transformation to align the face images. All faces are then identified using an appearance-based face recognition approach combined with additional information from local regions of the apes' faces. We conducted open-set identification experiments on both datasets. Even though the datasets are very challenging, the system achieved promising results and therefore has the potential to open up new ways for effective biodiversity conservation management.
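The alignment step fits a projective transformation from detected facial feature points. A standard estimator for such a transform from point correspondences is the Direct Linear Transform (DLT); the paper does not state which estimator it uses, so the sketch below is a generic one:

```python
import numpy as np

def fit_homography(src, dst):
    # Direct Linear Transform: least-squares estimate of the 3x3
    # projective matrix H mapping src to dst (homogeneous coordinates),
    # from at least four point correspondences.
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = Vt[-1].reshape(3, 3)  # right singular vector of smallest singular value
    return H / H[2, 2]

def warp_point(H, pt):
    p = H @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]
```

With the feature points mapped onto a canonical face template by `warp_point`, all faces share one coordinate frame before the appearance-based recognizer runs.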
Citations: 13
Exploiting JPEG Compression for Image Retrieval
Pub Date : 2012-12-10 DOI: 10.1109/ISM.2012.99
David Edmundson, G. Schaefer
Content-based image retrieval (CBIR) has been an active research area for many years, yet much of the research ignores the fact that most images are stored in compressed form, which affects retrieval both in terms of processing speed and retrieval accuracy. In this paper, we address various aspects of JPEG-compressed images in the context of image retrieval. We first analyse the effect of JPEG quantisation on image retrieval and present a robust method to address the resulting performance drop. We then compare various retrieval methods that work in the JPEG compressed domain, and finally propose two new methods that are based solely on information available in the JPEG header. One uses optimised Huffman tables for retrieval, while the other is based on tuned quantisation tables. Both techniques are shown to give retrieval performance comparable to existing methods while being orders of magnitude faster.
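Both proposed methods read features straight from the JPEG header. As a loose illustration of the idea (not the paper's tuned-table method), the 64-entry quantisation table can be normalised and compared directly, so two images encoded with similarly shaped tables score as close:

```python
import numpy as np

def quant_table_feature(qtable):
    # Normalise a 64-entry quantisation table from the JPEG header so
    # that encoders differing only in a global quality scale map to the
    # same feature. Purely illustrative, not the paper's tuned tables.
    q = np.asarray(qtable, dtype=float).ravel()
    return q / q.sum()

def header_distance(qa, qb):
    # L1 distance between normalised tables: 0 for identically shaped
    # tables, larger for different encoder settings.
    return float(np.abs(quant_table_feature(qa) - quant_table_feature(qb)).sum())
```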
Citations: 2
Effective Moving Object Detection and Retrieval via Integrating Spatial-Temporal Multimedia Information
Pub Date : 2012-12-10 DOI: 10.1109/ISM.2012.74
Dianting Liu, M. Shyu
In the area of multimedia semantic analysis and video retrieval, automatic object detection techniques play an important role. Without analysis of object-level features, it is hard to achieve high performance in semantic retrieval. As a branch of object detection research, moving object detection has also become a hot research field and has made considerable progress recently. This paper proposes a moving object detection and retrieval model that integrates the spatial and temporal information in video sequences and uses the proposed integral density method (adapted from the idea of integral images) to quickly identify motion regions in an unsupervised way. First, key information locations on video frames are identified as the maxima and minima of a Difference of Gaussians (DoG) function. Second, a motion map of adjacent frames is obtained from the diversity of the outcomes of the Simultaneous Partition and Class Parameter Estimation (SPCPE) framework. The motion map filters key information locations into key motion locations (KMLs), where the existence of moving objects is implied. Besides showing the motion zones, the motion map also indicates the motion direction, which guides the proposed integral density approach to quickly and accurately locate the motion regions. The detection results are not only illustrated visually but also verified by promising experimental results, which show that concept retrieval performance can be improved by integrating global and local visual information.
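The integral images that the integral density method borrows from are summed-area tables: after one cumulative-sum pass, the sum (and hence the density of key locations) over any rectangle costs four lookups. A minimal sketch:

```python
import numpy as np

def integral_image(a):
    # Summed-area table with a zero border: ii[y, x] = sum of a[:y, :x].
    ii = np.zeros((a.shape[0] + 1, a.shape[1] + 1))
    ii[1:, 1:] = a.cumsum(axis=0).cumsum(axis=1)
    return ii

def region_sum(ii, y0, x0, y1, x1):
    # Sum over a[y0:y1, x0:x1] with four lookups, i.e. O(1) per query.
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
```

Scoring many candidate motion windows therefore costs O(1) each after the single O(HW) table build, which is what makes the region search fast.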
Citations: 15