
Latest publications from the 2015 13th International Workshop on Content-Based Multimedia Indexing (CBMI)

Semi-supervised spectral clustering with automatic propagation of pairwise constraints
Pub Date : 2015-06-10 DOI: 10.1109/CBMI.2015.7153608
Nicolas Voiron, A. Benoît, Andrei Filip, P. Lambert, B. Ionescu
In our data-driven world, clustering is of major importance in helping end-users and decision makers understand information structures. Supervised learning techniques rely on ground truth to perform the classification and are usually subject to overtraining issues. Unsupervised clustering techniques, on the other hand, study the structure of the data without access to any training data. Given the difficulty of the task, unsupervised learning tends to provide results inferior to those of supervised learning. A compromise is then to use learning only for some of the ambiguous classes, in order to boost performance. In this context, this paper studies the impact of pairwise constraints on unsupervised spectral clustering. We introduce a new generalization of constraint propagation which maximizes partitioning quality while reducing annotation costs. Experiments show the effectiveness of the proposed scheme.
Citations: 2
DeepSketch: Deep convolutional neural networks for sketch recognition and similarity search
Pub Date : 2015-06-10 DOI: 10.1109/CBMI.2015.7153606
Omar Seddati, S. Dupont, S. Mahmoudi
In this paper, we present a system for sketch classification and similarity search. We use deep convolutional neural networks (ConvNets), the state of the art in the field of image recognition, which enable both classification and medium/high-level feature extraction. We use ConvNet features as a basis for similarity search with k-Nearest Neighbors (kNN). Evaluations are performed on the TU-Berlin benchmark. Our main contributions are threefold: first, we use ConvNets, in contrast to most previous approaches, which rely essentially on hand-crafted features. Second, we propose a ConvNet that is both more accurate and lighter/faster than the only two previous attempts at using ConvNets for hand-sketch recognition, reaching an accuracy of 75.42%. Third, we show that, similarly to their application on natural images, ConvNets allow the extraction of medium-level and high-level features (depending on the depth) which can be used for similarity search.
Citations: 57
Hierarchical clustering pseudo-relevance feedback for social image search result diversification
Pub Date : 2015-06-10 DOI: 10.1109/CBMI.2015.7153613
B. Boteanu, Ionut Mironica, B. Ionescu
This article addresses the issue of social image search result diversification. We propose a novel perspective on the diversification problem via Relevance Feedback (RF). Traditional RF introduces the user into the processing loop by harvesting feedback about the relevance of the search results; this information is used to recompute a better representation of the data. The novelty of our work is in exploiting this concept in a completely automated manner via pseudo-relevance, while prioritizing the diversification of the results over their relevance. User feedback is simulated automatically by selecting, from the initial query results, positive and negative examples with regard to relevance. Unsupervised hierarchical clustering is used to re-group images according to their content, and diversification is finally achieved with a re-ranking approach. Experimental validation on Flickr data shows the advantages of this approach.
Citations: 12
Learning to hash faces using large feature vectors
Pub Date : 2015-06-10 DOI: 10.1109/CBMI.2015.7153611
C. E. Santos, Ewa Kijak, G. Gravier, W. R. Schwartz
Face recognition has been widely studied in past years; however, most of the related work focuses on increasing accuracy and/or speed for testing a single probe-subject pair. In this work, we present a novel method inspired by the success of locality-sensitive hashing (LSH) applied to large general-purpose datasets and by the robustness of partial least squares (PLS) analysis when applied to large sets of feature vectors for face recognition. The result is a robust hashing method, compatible with feature combination, for fast computation of a short list of candidates in a large gallery of subjects. We provide theoretical support and practical principles for the proposed method that may be reused in the further development of hash functions applied to face galleries. The proposed method is evaluated on the FERET and FRGCv1 datasets and compared to other methods in the literature. Experimental results show that the proposed approach achieves a 16-fold speedup over scanning all subjects in the face gallery.
Citations: 10
Interactive detection of incrementally learned concepts in images with ranking and semantic query interpretation
Pub Date : 2015-06-10 DOI: 10.1109/CBMI.2015.7153623
K. Schutte, H. Bouma, J. Schavemaker, L. Daniele, Maya Sappelli, G. Koot, P. Eendebak, G. Azzopardi, Martijn Spitters, M. D. Boer, M. Kruithof, Paul Brandt
The number of networked cameras is growing exponentially, and multiple applications in different domains result in an increasing need to search semantically over video sensor data. In this paper, we present the GOOSE demonstrator, a real-time general-purpose search engine that allows users to pose natural-language queries to retrieve corresponding images. Top-down, the demonstrator interprets queries, which are presented as an intuitive graph to collect user feedback. Bottom-up, the system automatically recognizes and localizes concepts in images and can incrementally learn novel concepts. A smart ranking combines both and allows effective retrieval of relevant images.
Citations: 12
Temporal re-scoring vs. temporal descriptors for semantic indexing of videos
Pub Date : 2015-06-10 DOI: 10.1109/CBMI.2015.7153626
Abdelkader Hamadi, P. Mulhem, G. Quénot
The automated indexing of images and videos is a difficult problem because of the "distance" between the arrays of numbers encoding these documents and the concepts (e.g. people, places, events or objects) with which we wish to annotate them. Methods exist for this, but their results are far from satisfactory in terms of generality and accuracy. Existing methods typically use a single set of training examples and consider it as uniform. This is not optimal, because the same concept may appear in various contexts and its appearance may differ considerably depending upon these contexts. Context has been widely used in the state of the art to treat various problems; for videos, however, the temporal context seems to be the most crucial and the most effective. In this paper, we present a comparative study of two methods exploiting the temporal context for semantic video indexing. The proposed approaches use temporal information derived from two different sources: low-level content and semantic information. Our experiments on the TRECVID 2012 collection show interesting results that confirm the usefulness of the temporal context and demonstrate which of the two approaches is more effective.
Citations: 0
Interpretable video representation
Pub Date : 2015-06-10 DOI: 10.1109/CBMI.2015.7153602
Lukas Diem, M. Zaharieva
The immense amount of available video data poses novel requirements for video representation approaches: focusing on the central and relevant aspects of the underlying story and facilitating an efficient overview and assessment of the content. In general, the assessment of content relevance and significance is a high-level task that usually requires human intervention. However, some filming techniques imply importance and thus bear potential for automated content-based analysis. For example, core elements in a movie (such as the main characters and central objects) are often emphasized by repeated occurrence. In this paper, we present a new approach for the automated detection of such recurring elements in video sequences that provides a compact and interpretable content representation. The experiments performed outline the challenges and the potential of the algorithm for automated high-level video analysis.
Citations: 1
A factorized model for multiple SVM and multi-label classification for large scale multimedia indexing
Pub Date : 2015-06-10 DOI: 10.1109/CBMI.2015.7153610
Bahjat Safadi, G. Quénot
This paper presents a set of improvements for SVM-based large-scale multimedia indexing. The proposed method is particularly suited to the detection of many target concepts at once and to highly imbalanced classes (very infrequent concepts). The method is based on the use of multiple SVMs (MSVM) for dealing with class imbalance, and on some adaptations of this approach that allow an efficient implementation using optimized linear algebra routines. The implementation also involves hashed structures allowing the factorization of computations between the multiple SVMs and the multiple target concepts, and is denoted Factorized-MSVM. Experiments were conducted on a large-scale dataset, namely the TRECVid 2012 semantic indexing task. Results show that Factorized-MSVM performs as well as the original MSVM but is significantly faster: speed-ups by factors of several hundred were obtained for the simultaneous classification of 346 concepts, compared to the original MSVM implementation based on the popular libSVM library.
Citations: 6
On the use of statistical semantics for metadata-based social image retrieval
Pub Date : 2015-06-10 DOI: 10.1109/CBMI.2015.7153634
Navid Rekabsaz, R. Bierig, B. Ionescu, A. Hanbury, M. Lupu
We revisit text-based image retrieval for social media, exploring the opportunities offered by statistical semantics. We assess the performance and limitations of several complementary corpus-based semantic text similarity methods in combination with word representations, and compare the results with state-of-the-art text search engines. Our deep-learning-based semantic retrieval methods show a statistically significant improvement over a best-practice Solr search engine, at the expense of a significant increase in processing time. We provide a solution that reduces the semantic processing time by up to 48% compared to the standard approach, while achieving the same performance.
Citations: 6
Automatic detection of repetitive actions in a video
Pub Date : 2015-06-10 DOI: 10.1109/CBMI.2015.7153605
Hassan Wehbe, P. Joly, B. Haidar
In this paper we propose a method to locate in-loop repetitions in a video, where an in-loop repetition consists of repeating the same action(s) many times consecutively. The proposed method adapts the auto-correlation method YIN, originally proposed to find the fundamental frequency of audio signals. Based on this technique, we generate a matrix (which we call the YIN-Matrix) in which repetitions correspond to triangle-shaped zones of low values. Locating these triangles allows us to locate the video segments that enclose a repetition and to extract their parameters. To evaluate our method, we used a standard evaluation procedure that reports error rates against ground-truth information. Under this evaluation, our method shows promising results that make it a solid basis for future work.
Citations: 3