
Latest publications from Proceedings of the 21st ACM international conference on Multimedia

Analysis and forecasting of trending topics in online media streams
Pub Date: 2013-10-21 DOI: 10.1145/2502081.2502117
Tim Althoff, Damian Borth, Jörn Hees, A. Dengel
Among the vast information available on the web, social media streams capture what people currently pay attention to and how they feel about certain topics. Awareness of such trending topics plays a crucial role in multimedia systems such as trend aware recommendation and automatic vocabulary selection for video concept detection systems. Correctly utilizing trending topics requires a better understanding of their various characteristics in different social media streams. To this end, we present the first comprehensive study across three major online and social media streams, Twitter, Google, and Wikipedia, covering thousands of trending topics during an observation period of an entire year. Our results indicate that depending on one's requirements one does not necessarily have to turn to Twitter for information about current events and that some media streams strongly emphasize content of specific categories. As our second key contribution, we further present a novel approach for the challenging task of forecasting the life cycle of trending topics in the very moment they emerge. Our fully automated approach is based on a nearest neighbor forecasting technique exploiting our assumption that semantically similar topics exhibit similar behavior. We demonstrate on a large-scale dataset of Wikipedia page view statistics that forecasts by the proposed approach are about 9-48k views closer to the actual viewing statistics compared to baseline methods and achieve a mean average percentage error of 45-19% for time periods of up to 14 days.
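A minimal sketch of the nearest-neighbor forecasting idea in Python (illustrative only: the paper matches topics by semantic similarity, while this sketch substitutes Euclidean distance on the observed view-count prefix, and names such as `knn_forecast` and `history` are assumptions, not the authors' code):

```python
import numpy as np

def knn_forecast(history, query_prefix, k=5, horizon=14):
    """Forecast a trending topic's remaining life cycle from similar past topics.

    history      : dict mapping topic -> 1-D array of daily page views
    query_prefix : the views observed so far for the newly emerging topic
    Returns the average continuation of the k most similar past topics.
    """
    p = len(query_prefix)
    q = np.asarray(query_prefix, dtype=float)
    dists, tails = [], []
    for series in history.values():
        s = np.asarray(series, dtype=float)
        if len(s) < p + horizon:
            continue  # too short to supply a full continuation
        # stand-in for semantic similarity: distance on the observed prefix
        dists.append(np.linalg.norm(s[:p] - q))
        tails.append(s[p:p + horizon])
    nearest = np.argsort(dists)[:k]
    return np.mean([tails[i] for i in nearest], axis=0)

# toy usage: forecast 3 days ahead from a 2-day prefix
past = {"topicA": [10, 20, 40, 30, 20], "topicB": [12, 18, 35, 25, 15]}
print(knn_forecast(past, [11, 19], k=2, horizon=3))  # -> [37.5 27.5 17.5]
```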
Citations: 48
Towards a comprehensive computational model for aesthetic assessment of videos
Pub Date: 2013-10-21 DOI: 10.1145/2502081.2508119
Subhabrata Bhattacharya, Behnaz Nojavanasghari, Tao Chen, Dong Liu, Shih-Fu Chang, M. Shah
In this paper we propose a novel aesthetic model emphasizing psycho-visual statistics extracted from multiple levels, in contrast to earlier approaches that rely only on descriptors suited for image recognition or based on photographic principles. At the lowest level, we determine dark-channel, sharpness and eye-sensitivity statistics over rectangular cells within a frame. At the next level, we extract Sentibank features (1,200 pre-trained visual classifiers) on a given frame that invoke specific sentiments such as "colorful clouds", "smiling face" etc. and collect the classifier responses as frame-level statistics. At the topmost level, we extract trajectories from video shots. Using viewers' fixation priors, the trajectories are labeled as foreground or background/camera, and statistics are computed on each. Additionally, spatio-temporal local binary patterns are computed that capture texture variations in a given shot. Classifiers are trained on individual feature representations independently. On thorough evaluation of 9 different types of features, we select the best features from each level -- dark channel, affect and camera motion statistics. Next, the corresponding classifier scores are integrated in a sophisticated low-rank fusion framework to improve the final prediction scores. Our approach demonstrates strong correlation with human prediction on 1,000 broadcast-quality videos released by NHK as an aesthetic evaluation dataset.
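As one concrete illustration of the lowest feature level, dark-channel statistics over rectangular cells could be computed roughly as below (a sketch following He et al.'s dark-channel definition; the grid size, patch size, and all names are assumptions, not the paper's implementation):

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel_cells(frame, grid=(4, 4), patch=7):
    """Mean dark-channel value per rectangular cell of one video frame.

    frame : H x W x 3 float array in [0, 1]
    """
    # dark channel: per-pixel minimum over RGB, then a patch-wise spatial minimum
    dark = minimum_filter(frame.min(axis=2), size=patch)
    H, W = dark.shape
    gh, gw = grid
    stats = np.empty(grid)
    for i in range(gh):
        for j in range(gw):
            cell = dark[i * H // gh:(i + 1) * H // gh,
                        j * W // gw:(j + 1) * W // gw]
            stats[i, j] = cell.mean()  # one statistic per rectangular cell
    return stats.ravel()  # e.g. a 16-D frame descriptor for a 4x4 grid
```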
Citations: 67
Using quadratic programming to estimate feature relevance in structural analyses of music
Pub Date: 2013-10-21 DOI: 10.1145/2502081.2502124
Jordan B. L. Smith, E. Chew
To identify repeated patterns and contrasting sections in music, it is common to use self-similarity matrices (SSMs) to visualize and estimate structure. We introduce a novel application for SSMs derived from audio recordings: using them to learn about the potential reasoning behind a listener's annotation. We use SSMs generated by musically-motivated audio features at various timescales to represent contributions to a structural annotation. Since a listener's attention can shift among musical features (e.g., rhythm, timbre, and harmony) throughout a piece, we further break down the SSMs into section-wise components and use quadratic programming (QP) to minimize the distance between a linear sum of these components and the annotated description. We posit that the optimal section-wise weights on the feature components may indicate the features to which a listener attended when annotating a piece, and thus may help us to understand why two listeners disagreed about a piece's structure. We discuss some examples that substantiate the claim that feature relevance varies throughout a piece, using our method to investigate differences between listeners' interpretations, and lastly propose some variations on our method.
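The fitting step reduces to a small quadratic program. A minimal sketch, assuming non-negative weights and flattened SSMs (the paper's exact constraints may differ), solves it as non-negative least squares:

```python
import numpy as np
from scipy.optimize import nnls

def feature_weights(component_ssms, annotation_ssm):
    """Estimate weights w >= 0 so that sum_i w_i * S_i approximates S_ann.

    component_ssms : list of n x n SSMs, one per feature and timescale
    annotation_ssm : n x n SSM derived from the listener's annotation
    min_w ||A w - b||^2 s.t. w >= 0 is a QP; NNLS solves it directly.
    """
    A = np.column_stack([S.ravel() for S in component_ssms])  # n^2 x m
    b = annotation_ssm.ravel()
    w, _ = nnls(A, b)
    return w / w.sum() if w.sum() > 0 else w  # normalize for comparability
```

Applied section by section, the resulting weight vectors can then be compared across listeners.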
Citations: 10
Session details: Annotation
Pub Date: 2013-10-21 DOI: 10.1145/3245300
Pablo Caesar
{"title":"Session details: Annotation","authors":"Pablo Caesar","doi":"10.1145/3245300","DOIUrl":"https://doi.org/10.1145/3245300","url":null,"abstract":"","PeriodicalId":20448,"journal":{"name":"Proceedings of the 21st ACM international conference on Multimedia","volume":"182 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83023226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Exploring discriminative pose sub-patterns for effective action classification
Pub Date: 2013-10-21 DOI: 10.1145/2502081.2502094
Xu Zhao, Yuncai Liu, Yun Fu
The articulated configuration of human body parts is an essential representation of human motion and is therefore well suited for classifying human actions. In this work, we propose a novel approach to exploring discriminative pose sub-patterns for effective action classification. These pose sub-patterns are extracted from a predefined set of 3D poses represented by hierarchical motion angles. The basic idea is motivated by two observations: (1) there exist representative sub-patterns in each action class, from which the action class can be easily differentiated; (2) these sub-patterns frequently appear in the action class. By constructing a connection between frequent sub-patterns and the discriminative measure, we develop the SSPI, namely the Support Sub-Pattern Induced learning algorithm, for simultaneous feature selection and feature learning. Based on the algorithm, discriminative pose sub-patterns can be identified and used as a series of "magnetic centers" on the surface of a normalized super-sphere for feature transformation. The "attractive forces" from the sub-patterns determine the direction and step length of the transform. This transformation makes a feature more discriminative while maintaining dimensionality invariance. Comprehensive experimental studies conducted on a large-scale motion capture dataset demonstrate the effectiveness of the proposed approach for action classification and its superior performance over state-of-the-art techniques.
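One possible reading of the transform, sketched under strong assumptions (unit-norm features and an inverse-distance "attractive force"; the paper's actual force definition and sub-pattern mining are not reproduced here, so every detail is illustrative):

```python
import numpy as np

def attract_transform(x, centers, step=0.5):
    """Pull a pose feature toward its nearest sub-pattern center on the
    unit sphere, then re-project, so dimensionality is preserved.

    x       : d-dimensional pose feature
    centers : k x d array of discriminative sub-pattern centers (unit-norm)
    """
    x = x / np.linalg.norm(x)              # place the feature on the sphere
    d = np.linalg.norm(centers - x, axis=1)
    i = int(np.argmin(d))                  # nearest "magnetic center"
    force = 1.0 / (d[i] + 1e-8)            # assumed: stronger pull when closer
    step_len = min(step * force, 1.0)      # the force sets the step length
    y = x + step_len * (centers[i] - x)    # move toward the center
    return y / np.linalg.norm(y)           # back onto the super-sphere
```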
Citations: 11
Querying for video events by semantic signatures from few examples
Pub Date: 2013-10-21 DOI: 10.1145/2502081.2502160
M. Mazloom, A. Habibian, Cees G. M. Snoek
We aim to query web video for complex events using only a handful of video query examples, where the standard approach learns a ranker from hundreds of examples. We consider a semantic signature representation, consisting of off-the-shelf concept detectors, to capture the variance in semantic appearance of events. Since it is unknown what similarity metric and query fusion to use in such an event retrieval setting, we perform three experiments on unconstrained web videos from the TRECVID event detection task. It reveals that: retrieval with semantic signatures using normalized correlation as similarity metric outperforms a low-level bag-of-words alternative, multiple queries are best combined using late fusion with an average operator, and event retrieval is preferred over event classification when less than eight positive video examples are available.
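The favored retrieval recipe (normalized correlation as the similarity metric, late fusion with an average operator) is simple enough to sketch directly; function and variable names are illustrative:

```python
import numpy as np

def normalized_correlation(a, b):
    """Normalized (mean-centered) correlation between two semantic signatures."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def rank_videos(query_signatures, video_signatures):
    """Score every video against each query example, then late-fuse by averaging.

    query_signatures : list of concept-detector score vectors (few examples)
    video_signatures : m x d array, one semantic signature per web video
    Returns video indices sorted best-first.
    """
    scores = np.array([[normalized_correlation(q, v) for v in video_signatures]
                       for q in query_signatures])
    fused = scores.mean(axis=0)  # late fusion with an average operator
    return np.argsort(-fused)
```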
Citations: 47
Lecture video segmentation by automatically analyzing the synchronized slides
Pub Date: 2013-10-21 DOI: 10.1145/2502081.2508115
Xiaoyin Che, Haojin Yang, C. Meinel
In this paper we propose a solution that segments lecture videos by analyzing their supplementary synchronized slides. The slide content is derived automatically from an OCR (Optical Character Recognition) process with an approximate accuracy of 90%. We then partition the slides into different subtopics by examining their logical relevance. Since the slides are synchronized with the video stream, the subtopics of the slides indicate exactly the segments of the video. Our evaluation reveals that the average segment length for each lecture ranges from 5 to 15 minutes, and that 45% of the segments obtained from the test datasets are logically reasonable.
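A minimal sketch of the segmentation step, using Jaccard word overlap between consecutive OCR'd slides as a crude stand-in for the paper's logical-relevance analysis (the threshold and all names are assumptions):

```python
def segment_by_slides(slide_texts, slide_times, video_end, threshold=0.15):
    """Cut a lecture where consecutive slides share little vocabulary.

    slide_texts : OCR'd slide strings, in presentation order
    slide_times : start time (seconds) of each slide in the video
    video_end   : total video duration in seconds
    Returns a list of (segment_start, segment_end) times.
    """
    bags = [set(t.lower().split()) for t in slide_texts]
    cuts = [0]
    for i in range(1, len(bags)):
        union = bags[i] | bags[i - 1]
        overlap = len(bags[i] & bags[i - 1]) / len(union) if union else 0.0
        if overlap < threshold:  # low lexical overlap -> new subtopic
            cuts.append(i)
    starts = [slide_times[c] for c in cuts]
    ends = starts[1:] + [video_end]  # slides are synchronized with the video
    return list(zip(starts, ends))
```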
Citations: 42
Tell me what happened here in history
Pub Date: 2013-10-21 DOI: 10.1145/2502081.2502272
Jia Chen, Qin Jin, Weipeng Zhang, Shenghua Bao, Zhong Su, Yong Yu
This demo shows our system, which takes a landmark image as input, recognizes the landmark in the image, and returns the landmark's historical events with related photos. Unlike existing landmark-related research, we focus on the temporal dimension of a landmark. Our system automatically recognizes the landmark, shows historical events chronologically and provides detailed photos for the events. To build these functions, we fuse information from multiple online resources.
Citations: 2
Competitive affective gaming: winning with a smile
Pub Date: 2013-10-21 DOI: 10.1145/2502081.2502115
André Mourão, João Magalhães
Human-computer interaction (HCI) is expanding towards natural modalities of human expression. Gestures, body movements and other affective interaction techniques can change the way computers interact with humans. In this paper, we propose to extend existing interaction paradigms by including facial expression as a controller in videogames. NovaEmötions is a multiplayer game where players score by acting out an emotion through a facial expression. We designed an algorithm to offer an engaging interaction experience using the facial expression. Despite the novelty of the interaction method, our game scoring algorithm kept players engaged and competitive. A user study with 46 users showed the success and potential of affect-based interaction in videogames, i.e., facial expression as the sole controller. Moreover, we released a novel facial expression dataset with over 41,000 images. These face images were captured in a novel and realistic setting: users playing games where a player's facial expression has an impact on the game score.
Citations: 31
Background subtraction via coherent trajectory decomposition
Pub Date: 2013-10-21 DOI: 10.1145/2502081.2502144
Zhixiang Ren, L. Chia, D. Rajan, Shenghua Gao
Background subtraction, the task of detecting moving objects in a scene, is an important step in video analysis. In this paper, we propose an efficient background subtraction method based on coherent trajectory decomposition. We assume that the trajectories from the background lie in a low-rank subspace, and that foreground trajectories are sparse outliers in this background subspace. Meanwhile, a Markov Random Field (MRF) is used to encode spatial coherency and trajectory consistency. With the low-rank decomposition and the MRF, our method can better handle videos with a moving camera and obtain a coherent foreground. Experimental results on a video dataset show our method achieves very competitive performance.
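The decomposition itself can be sketched as a simplified robust-PCA iteration (alternating singular-value and entrywise soft-thresholding); the paper's MRF coupling for spatial coherency is omitted here, and the thresholds are illustrative:

```python
import numpy as np

def low_rank_sparse(M, lam=0.1, tau=1.0, iters=50):
    """Split a trajectory matrix M (one trajectory per column) into a
    low-rank background part L and a sparse foreground part S.
    """
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(iters):
        # low-rank step: soft-threshold the singular values of M - S
        U, sig, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U * np.maximum(sig - tau, 0.0)) @ Vt
        # sparse step: entrywise soft-threshold the residual M - L
        R = M - L
        S = np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)
    return L, S  # high-energy columns of S mark foreground trajectories
```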
Citations: 6