
2009 Seventh International Workshop on Content-Based Multimedia Indexing: Latest Publications

Hierarchical Summarisation of Video Using Ant-Tree Strategy
Pub Date : 2009-06-03 DOI: 10.1109/CBMI.2009.50
T. Piatrik, E. Izquierdo
Video summarisation approaches have various fields of application, particularly in organising, browsing and accessing large video databases. In this paper, the appropriateness of biologically inspired models for tackling these problems is discussed and a suitable strategy for unsupervised video summarisation is derived. In our proposal, we model the ability of ants to build live structures with their bodies in order to discover, in a distributed and unsupervised way, a tree-structured organisation and summarisation of the video data. An experimental evaluation validating the feasibility and robustness of this novel approach is presented.
Citations: 3
Motion Vector Based Moving Object Detection and Tracking in the MPEG Compressed Domain
Pub Date : 2009-06-03 DOI: 10.1109/CBMI.2009.33
T. Yokoyama, Toshiki Iwasaki, Toshinori Watanabe
As MPEG standards prevail, the opportunities to handle MPEG compressed videos increase, and video indexing and management techniques that can directly process compressed videos become important. MPEG video coding standards use motion compensation to compress video data, and motion compensation generates motion vectors that carry motion information, similar to optical flow, between regions in different frames. Although motion vectors are useful for video analysis, they are not always generated for moving objects, and it is difficult to analyse moving objects using these vectors alone. In this paper, we propose a moving object detection and tracking method in the MPEG compressed domain for video surveillance and management. Our method introduces images that record moving regions and accumulate unmoving regions in which moving objects are expected to exist after the current frame. By utilising these images, we can detect and track moving objects using only motion vectors, even when the motion vectors of moving objects become zero vectors due to object behaviour or are lost due to the picture type. We demonstrate the effectiveness of the proposed method through several experiments using actual videos acquired by an MPEG video camera.
Citations: 31
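The accumulation idea in the abstract above can be illustrated with a minimal, hypothetical sketch (not the authors' algorithm): threshold per-macroblock motion-vector magnitudes and keep a decaying mask of regions where motion was recently observed, so tracking can survive frames whose vectors are zero or missing.

```python
import numpy as np

def update_moving_mask(mask, motion_vectors, threshold=2.0, decay=0.9):
    """Accumulate a per-macroblock moving-region mask from motion vectors.

    mask:            (H, W) float array, accumulated motion evidence
    motion_vectors:  (H, W, 2) float array of per-macroblock (dx, dy)
    threshold/decay: illustrative constants, not values from the paper
    """
    magnitude = np.linalg.norm(motion_vectors, axis=-1)
    moving = (magnitude > threshold).astype(float)
    # Decay old evidence instead of dropping it, so an object whose
    # vectors momentarily become zero is not lost immediately.
    return np.maximum(moving, mask * decay)

# Toy example: a 4x4 macroblock grid with one 2x2 block moving right
mask = np.zeros((4, 4))
mv = np.zeros((4, 4, 2))
mv[1:3, 1:3] = [5.0, 0.0]
mask = update_moving_mask(mask, mv)
print(mask[1, 1], mask[0, 0])  # 1.0 0.0
```

Calling the update once per decoded frame keeps the mask in step with the video without ever fully decoding pixel data.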
Query Generation from Multiple Media Examples
Pub Date : 2009-06-03 DOI: 10.1109/CBMI.2009.13
Reede Ren, J. Jose
This paper exploits a media document representation called feature terms to generate a query from multiple media examples, e.g. images. A feature term denotes a continuous interval of a media feature dimension. This approach (1) helps accumulate features from multiple examples and (2) enables the exploration of text-based retrieval models for multimedia retrieval. Three criteria, minimised χ2, minimised AC/DC and maximised entropy, are proposed to optimise feature term selection. Two ranking functions, KL divergence and BM25, are used for relevance estimation. Experiments on the Corel photo collection and the TRECVid 2006 collection show the effectiveness of the approach in image/video retrieval.
Citations: 1
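The KL-divergence ranking mentioned above can be sketched as follows; the histogram-over-feature-terms representation and the smoothing constant are illustrative assumptions, not details from the paper.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-10):
    """KL(p || q) between discrete feature-term distributions,
    with a small eps to avoid log(0) on unseen terms."""
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def rank_by_kl(query_hist, doc_hists):
    """Rank documents by ascending KL(query || doc); smaller = more relevant."""
    scores = [kl_divergence(query_hist, d) for d in doc_hists]
    return sorted(range(len(doc_hists)), key=lambda i: scores[i])

query = [4, 1, 0, 1]  # feature-term counts pooled from several query examples
docs = [[4, 1, 0, 1], [0, 0, 5, 1], [3, 2, 0, 1]]
print(rank_by_kl(query, docs))  # the identical document (index 0) ranks first
```

Pooling counts from all query examples before ranking is what lets several images contribute to one textual-style query.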
Picadomo: Faceted Image Browsing for Mobile Devices
Pub Date : 2009-06-03 DOI: 10.1109/CBMI.2009.34
Adrian Hub, Daniel Blank, A. Henrich, W. Müller
Picadomo combines content-based image retrieval and faceted search on mobile devices. It is designed for finding images with desired visual properties, tags or other known metadata. Due to the limitations of mobile devices, such as small screens and low processing power, we had to carefully select the features used (dominant color, GPS data, tags, etc.). With Picadomo, the user can pick visualised facets directly via the touch screen, while very little screen space is used for facet-browsing navigation. We present our architecture, the facets used for image browsing, our new control concept and user experiments.
Citations: 8
Scalable Spatio-Temporal Video Indexing Using Sparse Multiscale Patches
Pub Date : 2009-06-03 DOI: 10.1109/CBMI.2009.48
Paolo Piro, S. Anthoine, E. Debreuve, M. Barlaud
In this paper we address the problem of scalable video indexing. We propose a new framework combining sparse spatial multiscale patches and Group of Pictures (GoP) motion patches. The distributions of these sets of patches are compared via the Kullback-Leibler divergence estimated in a non-parametric framework using a k-th Nearest Neighbor (kNN) estimator. We evaluated this similarity measure on selected videos from the ICOS-HD ANR project, probing in particular its robustness to resampling and compression and thus showing its scalability on heterogeneous networks.
Citations: 0
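A non-parametric kNN estimator of the Kullback-Leibler divergence, as referenced in the abstract, can be sketched with plain NumPy; this follows the standard Wang-Kulkarni-Verdú form and is not necessarily the authors' exact implementation.

```python
import numpy as np

def knn_kl_estimate(x, y, k=1):
    """k-NN estimate of KL(P || Q) from samples x ~ P (n, d) and y ~ Q (m, d).

    Uses brute-force pairwise distances for clarity; a KD-tree would be
    preferable for large patch sets.
    """
    x, y = np.atleast_2d(x), np.atleast_2d(y)
    n, d = x.shape
    m = y.shape[0]
    # rho: distance from each x_i to its k-th nearest neighbour within x
    dxx = np.linalg.norm(x[:, None] - x[None], axis=-1)
    rho = np.sort(dxx, axis=1)[:, k]       # column 0 is the zero self-distance
    # nu: distance from each x_i to its k-th nearest neighbour in y
    dxy = np.linalg.norm(x[:, None] - y[None], axis=-1)
    nu = np.sort(dxy, axis=1)[:, k - 1]
    return float(d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1)))

rng = np.random.default_rng(0)
same = knn_kl_estimate(rng.normal(0, 1, (500, 2)), rng.normal(0, 1, (500, 2)))
diff = knn_kl_estimate(rng.normal(0, 1, (500, 2)), rng.normal(3, 1, (500, 2)))
print(same, diff)  # near zero vs clearly positive
```

The appeal of this estimator for patch distributions is that it needs no binning or density model, only nearest-neighbour distances in the patch feature space.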
Model-Driven Design of Audiovisual Indexing Processes for Search-Based Applications
Pub Date : 2009-06-03 DOI: 10.1109/CBMI.2009.51
P. Fraternali, M. Brambilla, A. Bozzon
As the Web becomes a platform for multimedia content fruition, audiovisual search assumes a central role in providing users with the content most adequate to their information needs. A key issue for enabling audiovisual search is extracting indexable knowledge from opaque media. Such a process is heavily constrained by scalability and performance issues and must be able to flexibly incorporate specialized components for educing selected features from media elements. This paper shows how the use of a model-driven approach can help designers specify multimedia indexing processes, verify properties of interest in such processes, and generate the code that orchestrates the components, so as to enable rapid prototyping of content analysis processes in the presence of evolving requirements.
Citations: 3
A Comparison of L_1 Norm and L_2 Norm Multiple Kernel SVMs in Image and Video Classification
Pub Date : 2009-06-03 DOI: 10.1109/CBMI.2009.44
F. Yan, K. Mikolajczyk, J. Kittler, M. Tahir
SVM is one of the state-of-the-art techniques for image and video classification. When multiple kernels are available, the recently introduced multiple kernel SVM (MK-SVM) learns an optimal linear combination of the kernels, providing a new method for information fusion. In this paper we study how the behaviour of MK-SVM is affected by the norm used to regularise the kernel weights to be learnt. Through experiments on three image/video classification datasets as well as on synthesised data, new insights are gained as to how the choice of regularisation norm should be made, especially when MK-SVM is applied to image/video classification problems.
Citations: 20
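A multiple-kernel combination with L1- versus L2-normalised weights can be sketched as below; real MK-SVM training learns the weights jointly with the SVM objective, whereas this toy only shows the combination step with hand-picked weights (L1 normalisation is what tends to drive kernel weights to sparsity).

```python
import numpy as np

def combine_kernels(kernels, weights, norm="l1"):
    """Conic combination of base kernel matrices with normalised weights.

    norm="l1" rescales weights to sum to 1 (sparsity-inducing regime);
    norm="l2" rescales them to unit Euclidean norm (keeps all kernels).
    """
    w = np.clip(np.asarray(weights, float), 0, None)  # weights must be >= 0
    if norm == "l1":
        w = w / w.sum()
    elif norm == "l2":
        w = w / np.linalg.norm(w)
    else:
        raise ValueError(norm)
    return sum(wi * K for wi, K in zip(w, kernels))

# Two toy base kernels on three 1-D points: linear and RBF
X = np.array([[0.0], [1.0], [2.0]])
K_lin = X @ X.T
K_rbf = np.exp(-0.5 * (X - X.T) ** 2)
K = combine_kernels([K_lin, K_rbf], [3.0, 1.0], norm="l1")
print(np.round(K, 3))
```

The combined matrix stays symmetric and positive semi-definite because the weights are non-negative, so it can be passed to any kernel SVM solver unchanged.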
Compound Document Analysis by Fusing Evidence Across Media
Pub Date : 2009-06-03 DOI: 10.1109/CBMI.2009.35
S. Nikolopoulos, Christina Lakka, Y. Kompatsiaris, Christos Varytimidis, Konstantinos Rapantzikos, Yannis Avrithis
In this paper a cross-media analysis scheme for the semantic interpretation of compound documents is presented. It is essentially a late-fusion mechanism that operates on top of the output of single-media extractors, and its main novelty lies in using the evidence extracted from heterogeneous media sources to perform probabilistic inference on a Bayesian network that incorporates knowledge about the domain. Experiments performed on a set of 54 compound documents showed that the proposed scheme is able to exploit the existing cross-media relations and achieve performance improvements.
Citations: 3
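The late-fusion idea can be illustrated with a minimal Bayes-rule sketch that assumes conditionally independent modalities; the paper's Bayesian network encodes richer domain structure than this toy does.

```python
def fuse_evidence(prior, likelihoods):
    """Late fusion of single-media detector outputs via Bayes' rule,
    assuming modalities are conditionally independent given the concept.

    prior:        P(concept)
    likelihoods:  list of (P(evidence | concept), P(evidence | not concept)),
                  one pair per modality (text, visual, ...)
    """
    pos, neg = prior, 1.0 - prior
    for p_pos, p_neg in likelihoods:
        pos *= p_pos
        neg *= p_neg
    return pos / (pos + neg)

# Text evidence weakly positive, visual evidence strongly positive
posterior = fuse_evidence(0.2, [(0.6, 0.4), (0.9, 0.1)])
print(round(posterior, 3))  # well above the 0.2 prior
```

Even this naive version shows the effect the paper exploits: two individually inconclusive detectors can together push the posterior decisively past the prior.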
Kernel Discriminant Analysis Using Triangular Kernel for Semantic Scene Classification
Pub Date : 2009-06-03 DOI: 10.1109/CBMI.2009.47
M. Tahir, J. Kittler, F. Yan, K. Mikolajczyk
Semantic scene classification is a challenging research problem that aims to categorise images into semantic classes such as beaches, sunsets or mountains. This problem can be formulated as a multi-label classification problem, where an image can belong to more than one conceptual class, such as sunsets and beaches, at the same time. Recently, Kernel Discriminant Analysis combined with spectral regression (SR-KDA) has been successfully used for face, text and spoken letter recognition. However, the SR-KDA method works only with positive definite symmetric matrices. In this paper, we have modified this method to support both definite and indefinite symmetric matrices. The main idea is to use the LDLT decomposition instead of the Cholesky decomposition. The modified SR-KDA is applied to a scene database involving 6 concepts. We validate the advocated approach and demonstrate that it yields significant performance gains when a conditionally positive definite triangular kernel is used instead of positive definite symmetric kernels such as linear, polynomial or RBF. The results also indicate performance gains when compared with state-of-the-art multi-label methods for semantic scene classification.
Citations: 9
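The conditionally positive definite triangular kernel discussed above, commonly taken as k(x, y) = -||x - y||, can be sketched and numerically checked as follows (an illustrative sketch, not the paper's code):

```python
import numpy as np

def triangular_kernel(X, Y=None):
    """Triangular kernel k(x, y) = -||x - y||.

    The Gram matrix is indefinite, but the kernel is conditionally
    positive definite: c^T K c >= 0 for any c with sum(c) == 0.
    """
    Y = X if Y is None else Y
    return -np.linalg.norm(X[:, None] - Y[None], axis=-1)

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 5))
K = triangular_kernel(X)
c = rng.normal(size=20)
c -= c.mean()                      # enforce sum(c) == 0
print(c @ K @ c >= -1e-9)         # True: the CPD condition holds
```

This indefiniteness is exactly why plain Cholesky factorisation (which requires positive definiteness) fails on such Gram matrices, motivating the paper's switch to an LDLT decomposition.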
Multimodal Space for Rushes Representation and Retrieval
Pub Date : 2009-06-03 DOI: 10.1109/CBMI.2009.28
Sergio Benini, Luca Canini, P. Migliorati, R. Leonardi
In video content analysis, a growing research effort aims at characterising a specific type of unedited content, called rushes. This raw material, used by broadcasters and film studios when editing video programmes, usually lies un-annotated in a huge database. In this work, we aim at retrieving a desired type of rush by representing the whole database content in a multimodal space. Each rush is mapped into a trajectory whose coordinates are connected to multimodal features and to the filming techniques used by cameramen while shooting. The evolution of the trajectory over time provides a strong characterisation of the video, so that different types of rushes are located in different regions of the multimodal space. The ability of the proposed method has been tested by retrieving similar rushes from a large database provided by EiTB, the main broadcaster of the Basque Country.
Citations: 3