
Latest publications from the 2015 IEEE International Symposium on Multimedia (ISM)

Personalized Indexing of Attention in Lectures -- Requirements and Concept
Pub Date : 2015-12-01 DOI: 10.1109/ISM.2015.44
Sebastian Pospiech, N. Birnbaum, L. Knipping, R. Mertens
Web lectures can be employed in a variety of didactic scenarios, ranging from an add-on for a live lecture to stand-alone learning content. In all of these scenarios, though less so in the stand-alone one, indexing and navigation are crucial for real-world usability. As a consequence, many approaches have been devised, such as slide-based indexing, transcript-based indexing, collaborative manual indexing, as well as individual or social indexing based on viewing behavior. The approach proposed in this paper takes individual indexing based on viewing behavior two steps further in that it (a) indexes the recording at production time in the lecture hall and (b) actively analyzes the students' attention focus instead of passively recording viewing time as done in conventional footprinting. In order to track student attention during the lecture, it is necessary to record and analyze the students' behaviour in parallel to the lecture and to synchronize both data streams. This paper discusses the architecture required for personalized attention-based indexing, possible problems, and strategies to tackle them.
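The core of the indexing step described above can be sketched as follows: once the attention stream is synchronized with the recording's clock, consecutive high-attention samples collapse into (start, end) segments that serve as personalized navigation anchors. All names here (`AttentionSample`, `ATTENTION_THRESHOLD`) are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

ATTENTION_THRESHOLD = 0.7  # assumed cut-off for "focused" attention

@dataclass
class AttentionSample:
    t: float       # seconds since lecture start (shared clock with recording)
    score: float   # 0.0 (distracted) .. 1.0 (fully focused)

def index_segments(samples: List[AttentionSample]) -> List[Tuple[float, float]]:
    """Collapse consecutive above-threshold samples into (start, end)
    segments that can anchor personalized navigation in the recording."""
    segments, start = [], None
    for s in samples:
        if s.score >= ATTENTION_THRESHOLD and start is None:
            start = s.t                      # attention segment opens
        elif s.score < ATTENTION_THRESHOLD and start is not None:
            segments.append((start, s.t))    # attention segment closes
            start = None
    if start is not None:                    # close a segment still open at the end
        segments.append((start, samples[-1].t))
    return segments

samples = [AttentionSample(t, sc) for t, sc in
           [(0, 0.2), (10, 0.8), (20, 0.9), (30, 0.4), (40, 0.75), (50, 0.8)]]
print(index_segments(samples))  # [(10, 30), (40, 50)]
```

A real system would additionally have to resolve clock drift between the capture devices and the lecture recorder before this per-student pass runs.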
Citations: 0
Employing Sensors and Services Fusion to Detect and Assess Driving Events
Pub Date : 2015-12-01 DOI: 10.1109/ISM.2015.121
Seyed Vahid Hosseinioun, Hussein Al Osman, Abdulmotaleb El Saddik
With the remarkable increase in the use of sensors in our daily lives, various methods have been devised to detect events in a driving environment using smartphones, as they provide two main advantages: they eliminate the need for dedicated hardware in vehicles, and they are widely accessible. Since rewarding safe driving is an important issue for insurance companies, some companies are implementing Usage-Based Insurance (UBI) as opposed to traditional history-based plans. The collection of driving events, such as acceleration and turning, is a prerequisite for the adoption of such plans. Mobile phone sensors are capable of detecting whether a car is accelerating or braking, while through service fusion we can detect other events such as speeding or instances of severe weather. We propose a new and robust hybrid classification algorithm that detects acceleration-based events with an F1-score of 0.9304 and turn events with an F1-score of 0.9038. We further propose a method for measuring a driving performance index using the detected events.
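As a minimal illustration of the sensing side (not the paper's hybrid classifier), harsh acceleration and braking events can be flagged from longitudinal accelerometer readings with a simple threshold; the 3.0 m/s² cut-off is an assumption for demonstration only.

```python
HARSH_THRESHOLD = 3.0  # m/s^2, assumed cut-off for a "harsh" event

def detect_events(accel_samples):
    """Return (sample_index, label) pairs for longitudinal accelerometer
    readings whose magnitude exceeds the threshold."""
    events = []
    for i, a in enumerate(accel_samples):
        if a >= HARSH_THRESHOLD:
            events.append((i, "harsh_acceleration"))
        elif a <= -HARSH_THRESHOLD:
            events.append((i, "harsh_braking"))
    return events

readings = [0.5, 1.2, 3.4, 0.9, -3.8, -1.0]  # m/s^2 along the driving axis
print(detect_events(readings))  # [(2, 'harsh_acceleration'), (4, 'harsh_braking')]
```

The paper's approach goes further by fusing such sensor-level detections with external services (e.g. speed limits, weather) before classification.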
Citations: 13
Exploring the Complementarity of Audio-Visual Structural Regularities for the Classification of Videos into TV-Program Collections
Pub Date : 2015-12-01 DOI: 10.1109/ISM.2015.133
G. Sargent, P. Hanna, H. Nicolas, F. Bimbot
This article analyzes the structural regularities of the audio and video streams of TV programs and explores their potential for classifying videos into program collections. Our approach is based on the spectral analysis of distance matrices representing the short- and long-term dependencies within the audio and visual modalities of a video. We propose to compare two videos by their respective spectral features. We assess the benefits brought by the two modalities to performance in the context of a K-nearest-neighbor classification, and we test our approach in the context of an unsupervised clustering algorithm. These evaluations are performed on two datasets of French and Italian TV programs.
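The general idea of a spectral signature over a self-distance matrix can be sketched as follows. This is an illustrative reading of the abstract, not the authors' feature pipeline: per-frame features yield a pairwise-distance matrix, whose leading eigenvalue profile serves as a compact structural descriptor that two videos can be compared on.

```python
import numpy as np

def spectral_signature(features: np.ndarray, k: int = 5) -> np.ndarray:
    """features: (n_frames, dim). Returns the k largest absolute eigenvalues
    of the frame-to-frame distance matrix, normalized for scale invariance."""
    # pairwise Euclidean distances between all frame feature vectors
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    eig = np.sort(np.abs(np.linalg.eigvalsh(d)))[::-1][:k]
    return eig / (eig.sum() + 1e-12)

def signature_distance(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.linalg.norm(spectral_signature(a) - spectral_signature(b)))

rng = np.random.default_rng(0)
v1 = rng.normal(size=(40, 8))                   # stand-in per-frame features
v2 = v1 + rng.normal(scale=0.01, size=(40, 8))  # near-duplicate structure
v3 = rng.normal(size=(40, 8))                   # unrelated "video"
print(signature_distance(v1, v2), signature_distance(v1, v3))
```

A K-nearest-neighbor classifier over such signatures, one per modality, would then expose the complementarity the paper investigates.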
Citations: 3
A Novel Two Pass Rate Control Scheme for Variable Bit Rate Video Streaming
Pub Date : 2015-12-01 DOI: 10.1109/ISM.2015.32
M. VenkataPhaniKumar, K. C. R. C. Varma, S. Mahapatra
In this paper, a novel two-pass rate control scheme is proposed to achieve consistent visual quality for variable bit rate (VBR) video streaming. The rate-distortion (RD) characteristics of each frame are used to establish a frame complexity model, which is later used along with statistics collected in the first pass to derive an optimal quantization parameter for encoding the frame in the second pass. The experimental results demonstrate that the proposed rate control scheme significantly outperforms the existing rate control mechanism in the Joint Model (JM) reference software in terms of Peak Signal to Noise Ratio (PSNR) and consistency of perceptual visual quality while achieving the target bit rate. Further, the proposed scheme is validated through implementation on a miniature test-bed.
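The general two-pass shape can be sketched as below; the paper's actual complexity model and QP derivation are more elaborate, and both the base QP and the logarithmic adjustment here are assumptions chosen only to show the mechanism: complex frames (measured in the first pass) get a lower QP (more bits) so perceived quality stays even.

```python
import math

BASE_QP = 30          # assumed base quantization parameter
SENSITIVITY = 6.0     # assumed scaling of the complexity adjustment

def second_pass_qp(complexities):
    """Map per-frame complexities from the first pass to per-frame QPs
    for the second pass: above-average complexity lowers the QP."""
    mean_c = sum(complexities) / len(complexities)
    qps = []
    for c in complexities:
        # more complex than average -> spend more bits (lower QP)
        delta = SENSITIVITY * math.log2(c / mean_c)
        qps.append(round(BASE_QP - delta))
    return qps

print(second_pass_qp([1.0, 2.0, 4.0, 1.0]))  # [36, 30, 24, 36]
```

A real rate controller would additionally clamp the QP range and feed back the bit budget so the sequence still hits the target bit rate.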
Citations: 2
Efficient Multi-training Framework of Image Deep Learning on GPU Cluster
Pub Date : 2015-12-01 DOI: 10.1109/ISM.2015.119
Chun-Fu Chen, G. Lee, Yinglong Xia, Wan-Yi Sabrina Lin, T. Suzumura, Ching-Yung Lin
In this paper, we develop a pipelining schema for image deep learning on a GPU cluster to distribute the heavy workload of the training procedure. In addition, it is usually necessary to train multiple models to obtain a good deep learning model, due to the limited a priori knowledge of deep neural network structure. Therefore, adopting parallel and distributed computing appears to be an obvious path forward, but the mileage varies depending on how amenable a deep network is to parallelization and on the availability of rapid prototyping capabilities with a low cost of entry. In this work, we propose a framework that organizes the training procedures of multiple deep learning models into a pipeline on a GPU cluster, where each stage is handled by a particular GPU with a partition of the training dataset. Instead of frequently migrating data among the disks, CPUs, and GPUs, our framework only moves partially trained models, to reduce bandwidth consumption and to leverage the full computation capability of the cluster. We deploy the proposed framework on popular image recognition tasks using deep learning, and the experiments show that the proposed method reduces overall training time by up to dozens of hours compared to the baseline method.
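The scheduling idea can be illustrated with a toy time-stepped plan (names and structure are ours, not the paper's): each of the K stages owns one data partition ("GPU"), and at every step each stage trains a different model, so after a short fill phase all partitions stay busy while only model state moves between stages.

```python
def pipeline_schedule(n_models, n_stages):
    """Return a list of time steps; each step maps stage -> model trained
    at that step. Stages outside the window are idle during fill/drain."""
    steps = []
    for t in range(n_models + n_stages - 1):
        step = {}
        for stage in range(n_stages):
            model = t - stage          # model m enters stage s at time m + s
            if 0 <= model < n_models:
                step[stage] = model
        steps.append(step)
    return steps

for t, step in enumerate(pipeline_schedule(3, 3)):
    print(t, step)
# t=2 is the first fully utilized step: every stage trains a distinct model
```

With 3 models and 3 stages the pipeline fills at step 2 ({0: 2, 1: 1, 2: 0}) and drains over the last two steps.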
Citations: 7
A User-Based Framework for Group Re-Identification in Still Images
Pub Date : 2015-12-01 DOI: 10.1109/ISM.2015.41
Nestor Z. Salamon, Julio C. S. Jacques Junior, S. Musse
In this work we propose a framework for group re-identification based on manually defined soft-biometric characteristics. Users choose colors that describe the soft-biometric attributes of each person belonging to the searched group. Our technique matches these structured attributes against image databases using color distance metrics, a novel adaptive threshold selection, and a high-level feature based on people's proximity. Experimental results show that the proposed approach helps the re-identification procedure rank the most likely results without training data, and is also extensible to work without previous images.
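A minimal sketch of the color-matching step, under our own assumptions rather than the paper's implementation: candidate persons are compared to a user-chosen color by Euclidean RGB distance, and the acceptance threshold is derived adaptively from the observed distance distribution (here, mean minus half the standard deviation, an assumed form).

```python
import statistics

def color_distance(c1, c2):
    """Euclidean distance between two RGB triples."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5

def adaptive_matches(query_rgb, candidates):
    """candidates: list of (person_id, rgb). Returns ids whose distance to
    the query falls under a threshold taken from the distances themselves."""
    dists = [(pid, color_distance(query_rgb, rgb)) for pid, rgb in candidates]
    values = [d for _, d in dists]
    threshold = statistics.mean(values) - 0.5 * statistics.pstdev(values)
    return [pid for pid, d in dists if d <= threshold]

people = [("p1", (250, 10, 10)), ("p2", (20, 240, 30)), ("p3", (240, 30, 20))]
print(adaptive_matches((255, 0, 0), people))  # ['p1', 'p3']
```

Deriving the threshold from the data itself is what lets such a system run without training data, as the abstract emphasizes.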
Citations: 1
Design and Development of a Cloud Based Cyber-Physical Architecture for the Internet-of-Things
Pub Date : 2015-12-01 DOI: 10.1109/ISM.2015.96
K. M. Alam, Alex Sopena, Abdulmotaleb El Saddik
The Internet-of-Things (IoT) is considered the next big disruptive technology field, whose main goal is to achieve social good by enabling collaboration among physical things or sensors. We present a cloud-based cyber-physical architecture to leverage the Sensing-as-a-Service (SenAS) model, where every physical thing is complemented by a twin cloud-based cyber process. In this model, things can communicate using direct physical connections or through the cyber layer using peer-to-peer inter-process communications. The proposed model offers simultaneous communication channels among groups of things by uniquely tagging each group with a relationship ID. An intelligent service layer ensures custom privacy and access rights management for the sensor owners. We also present the implementation details of an IoT platform and demonstrate its practicality by developing case study applications for the Internet-of-Vehicles (IoV) and the connected smart home.
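The relationship-ID grouping can be pictured as a small publish/subscribe broker in the cyber layer; the `CyberLayerBroker` API below is hypothetical and exists only to illustrate how a group tag routes a message to every twin process registered under it.

```python
from collections import defaultdict

class CyberLayerBroker:
    """Toy cyber-layer broker: things join groups by relationship ID, and a
    publish to that ID reaches every other member's twin cyber process."""
    def __init__(self):
        self._groups = defaultdict(set)   # relationship_id -> thing ids
        self.delivered = []               # (thing_id, message) delivery log

    def join(self, relationship_id, thing_id):
        self._groups[relationship_id].add(thing_id)

    def publish(self, relationship_id, sender_id, message):
        # deliver to all group members except the sender (sorted for a
        # deterministic order in this sketch)
        for thing in sorted(self._groups[relationship_id] - {sender_id}):
            self.delivered.append((thing, message))

broker = CyberLayerBroker()
for thing in ("car", "garage_door", "thermostat"):
    broker.join("home-42", thing)
broker.publish("home-42", "car", "arriving")
print(broker.delivered)  # [('garage_door', 'arriving'), ('thermostat', 'arriving')]
```

In the paper's architecture, a service layer on top of such routing would additionally enforce each sensor owner's privacy and access rights.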
Citations: 15
An Unified Image Tagging System Driven by Image-Click-Ads Framework
Pub Date : 2015-12-01 DOI: 10.1109/ISM.2015.12
Qiong Wu, P. Boulanger
With the exponential growth of web image data, image tagging is becoming crucial in many image-based applications such as object recognition and content-based image retrieval. Despite the great progress achieved in automatic recognition technologies, none has yet provided a satisfactory solution that is widely useful for generic image recognition problems. So far, only manual tagging can provide reliable tagging results. However, such work is tedious and costly, and workers have little motivation. In this paper, we propose an online image tagging system, EyeDentifyIt, driven by an image-click-ads framework, which motivates crowdsourcing workers as well as general web users to tag images at high quality for low cost with low workload. A series of usability studies demonstrates how EyeDentifyIt provides improved user motivation and requires less workload compared to state-of-the-art approaches.
Citations: 2
Reconstructing Missing Areas in Facial Images
Pub Date : 2015-12-01 DOI: 10.1109/ISM.2015.68
Christoph Jansen, Radek Mackowiak, N. Hezel, Moritz Ufer, Gregor Altstadt, K. U. Barthel
In this paper, we present a novel approach to reconstructing missing areas in facial images using a series of Restricted Boltzmann Machines (RBMs). RBMs created with a low number of hidden neurons generalize well and are able to reconstruct basic structures in the missing areas. On the other hand, networks with many hidden neurons tend to emphasize details when using the reconstructions of the previous, more generalized RBMs as their input. Since trained RBMs are fast at encoding and decoding data by design, our method is also suitable for processing video streams.
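The reconstruction mechanics can be sketched schematically: known pixels stay clamped while the missing region is repeatedly re-estimated through the hidden layer. The weights below are random stand-ins, so the filled values are not meaningful; a real use, as the abstract describes, would chain several trained RBMs from coarse to fine.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruct(v, known_mask, W, b_h, b_v, steps=10):
    """v: flattened image in [0, 1]; known_mask: True where pixels are
    observed. Each step passes the image through the hidden layer and
    writes the reconstruction back only into the missing region."""
    v = v.copy()
    for _ in range(steps):
        h = sigmoid(v @ W + b_h)             # hidden activations
        v_new = sigmoid(h @ W.T + b_v)       # visible reconstruction
        v[~known_mask] = v_new[~known_mask]  # clamp: only fill missing area
    return v

rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(16, 8))      # untrained stand-in weights
v = rng.random(16)                           # tiny 16-"pixel" image
mask = np.ones(16, dtype=bool)
mask[4:8] = False                            # 4 missing pixels
out = reconstruct(v, mask, W, np.zeros(8), np.zeros(16))
assert np.allclose(out[mask], v[mask])       # observed pixels are untouched
```

Chaining RBMs means running this loop once per network, feeding each stage's output to the next, with the coarse network's result guiding the detailed ones.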
Citations: 2
Human-Based Video Browsing - Investigating Interface Design for Fast Video Browsing
Pub Date : 2015-12-01 DOI: 10.1109/ISM.2015.104
Wolfgang Hürst, R. V. D. Werken
The Video Browser Showdown (VBS) is an annual event where researchers evaluate their video search systems in a competitive setting. Searching in videos is often a two-step process: first, some sort of pre-filtering is done, where, for example, users query an indexed archive of files; this is followed by human-based browsing, where users skim the returned result set in search of the relevant file or a portion of it. The VBS targets this whole search process, focusing in particular on its interactive aspects. Encouraged by previous years' results, we created a system that purely addresses the latter issue, i.e., interface and interaction design. By eliminating all kinds of video indexing and query processing, we aimed to demonstrate the importance of good interface design for video search, whose relevance is often underestimated by today's systems. This claim is clearly supported by the results our system achieved in the VBS 2015 competition, where our approach was on a par with the top-performing ones. In this paper, we describe our system along with related design decisions, present our results from the VBS event, and discuss them in further detail.
Citations: 7