
Latest publications from the 2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI)

Simple tag-based subclass representations for visually-varied image classes
Pub Date : 2016-06-30 DOI: 10.1109/CBMI.2016.7500265
Xinchao Li, Peng Xu, Yue Shi, M. Larson, A. Hanjalic
In this paper, we present a subclass-representation approach that predicts the probability of a social image belonging to one particular class. We explore the co-occurrence of user-contributed tags to find subclasses with a strong connection to the top-level class. We then project each image onto the resulting subclass space, generating a subclass representation for the image. The advantage of our tag-based subclasses is that they have a chance of being more visually stable and easier to model than top-level classes. Our contribution is to demonstrate that a simple and inexpensive method for generating subclass representations has the ability to improve classification results in the case of tag classes that are visually highly heterogeneous. The approach is evaluated on a set of 1 million photos with 10 top-level classes, from the dataset released by the ACM Multimedia 2013 Yahoo! Large-scale Flickr-tag Image Classification Grand Challenge. Experiments show that the proposed system delivers sound performance for visually diverse classes compared with methods that directly model top classes.
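The tag co-occurrence step can be sketched in a few lines. This is only an illustration of the idea, not the paper's implementation: the helper names (`find_subclasses`, `subclass_representation`) and the toy tag sets are invented, and the paper's projection onto the subclass space is soft rather than the binary indicator used here.

```python
from collections import Counter

def find_subclasses(tagged_images, top_class, k):
    """Keep the k user tags that most often co-occur with the
    top-level class tag; these act as candidate subclasses."""
    co = Counter()
    for tags in tagged_images:
        if top_class in tags:
            co.update(t for t in tags if t != top_class)
    return [tag for tag, _ in co.most_common(k)]

def subclass_representation(tags, subclasses):
    """Project one image's tag set onto the subclass space as a
    binary indicator vector (a soft projection would use scores)."""
    return [1 if s in tags else 0 for s in subclasses]

images = [{"food", "pizza", "cheese"}, {"food", "pizza"},
          {"food", "sushi"}, {"travel", "beach"}]
subclasses = find_subclasses(images, "food", k=2)  # "pizza" ranks first
vector = subclass_representation({"food", "pizza"}, subclasses)
```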
Citations: 0
Music Tweet Map: A browsing interface to explore the microblogosphere of music
Pub Date : 2016-06-30 DOI: 10.1109/CBMI.2016.7500277
D. Hauger, M. Schedl
In this demo paper, we present the “Music Tweet Map” interface for browsing music listening events on a global scale. These events have been extracted automatically from a large set of microblogs harvested from Twitter. We showcase the major functionalities offered by the interface, i.e., browsing music by time, specific locations, topic clusters learned from tag information, and music charts. Furthermore, music can be explored via artist similarity. To this end, we present a music similarity measure, based on co-occurrence analysis of items in users' listening histories.
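The abstract leaves the exact co-occurrence measure open; a minimal sketch, assuming Jaccard overlap of listener sets (the function name and listening histories below are illustrative, not from the paper):

```python
from collections import defaultdict
from itertools import combinations

def artist_similarity(histories):
    """Jaccard co-occurrence: for each artist pair, the share of
    listeners the two artists have in common."""
    listeners = defaultdict(set)
    for user, artists in histories.items():
        for a in artists:
            listeners[a].add(user)
    sims = {}
    for a, b in combinations(sorted(listeners), 2):
        inter = listeners[a] & listeners[b]
        union = listeners[a] | listeners[b]
        sims[(a, b)] = len(inter) / len(union)
    return sims

histories = {"u1": {"Cohen", "Dylan"},
             "u2": {"Cohen", "Dylan"},
             "u3": {"Cash", "Dylan"}}
sims = artist_similarity(histories)  # ("Cohen", "Dylan") -> 2/3
```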
Citations: 4
A novel architecture of semantic web reasoner based on transferable belief model
Pub Date : 2016-06-30 DOI: 10.1109/CBMI.2016.7500269
C. Pantoja, E. Izquierdo
As the Semantic Web gains popularity, the technical challenges of representing and reasoning with imprecise and uncertain information remain an outstanding issue. The foundation of the Semantic Web is the assertion of relations between entities, but these relations usually carry no degree or strength. Using a simple subject-predicate-object triple we can say that “Alice” (subject) “likes” (predicate) “Rock music” (object), but we cannot say that she does so with, for example, 80% confidence. We propose the use of the Transferable Belief Model (TBM) as a way to achieve this. This work presents two contributions: an ontology to represent information in a way consistent with the TBM, and a reasoner to assess the knowledge of a given system. Tests show the feasibility of applying this model to large-scale Semantic Web information, but further optimisations and tests must be performed.
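The combination rule at the heart of the TBM can be stated compactly: masses over subsets of a frame are intersected and multiplied pairwise, and conflict accumulates on the empty set instead of being renormalised away (as Dempster's rule would do). The example masses below are invented for illustration:

```python
def conjunctive_combine(m1, m2):
    """TBM conjunctive rule: intersect every pair of hypotheses and
    multiply their masses; mass on the empty set (conflict) is kept."""
    out = {}
    for h1, v1 in m1.items():
        for h2, v2 in m2.items():
            h = h1 & h2
            out[h] = out.get(h, 0.0) + v1 * v2
    return out

frame = frozenset({"likes", "dislikes"})
likes = frozenset({"likes"})
# Two sources asserting "Alice likes Rock music" with partial belief:
m1 = {likes: 0.8, frame: 0.2}
m2 = {likes: 0.6, frame: 0.4}
combined = conjunctive_combine(m1, m2)  # likes: 0.92, frame: 0.08
```

Because the two sources agree, no mass lands on the empty set here; with contradictory sources it would, quantifying their conflict.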
Citations: 0
A multimedia interactive search engine based on graph-based and non-linear multimodal fusion
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500276
A. Moumtzidou, Ilias Gialampoukidis, Theodoros Mironidis, Dimitris Liparas, S. Vrochidis, Y. Kompatsiaris
This paper presents an interactive multimedia search engine, which is capable of searching multimedia collections by fusing textual and visual information. Apart from multimedia search, the engine is able to perform text search and image retrieval independently using both high-level and low-level information. The images of the multimedia collection are organized by color, offering fast browsing of the image collection.
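The paper's fusion is graph-based and non-linear; as a stand-in for the general idea, a weighted geometric mean of per-modality scores is one simple non-linear late-fusion scheme (the item names and scores are illustrative, not the engine's actual method):

```python
def late_fusion(text_scores, visual_scores, alpha=0.5):
    """Fuse per-item retrieval scores from two modalities with a
    weighted geometric mean; an item scoring zero in either modality
    is strongly penalised, unlike with a linear sum."""
    fused = {}
    for item in set(text_scores) | set(visual_scores):
        t = text_scores.get(item, 0.0)
        v = visual_scores.get(item, 0.0)
        fused[item] = (t ** alpha) * (v ** (1.0 - alpha))
    return fused

text = {"img1": 0.9, "img2": 0.4}
visual = {"img1": 0.5, "img2": 0.8}
fused = late_fusion(text, visual)
ranking = sorted(fused, key=fused.get, reverse=True)  # "img1" first
```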
Citations: 3
Flow Cytometry based automatic MRD assessment in Acute Lymphoblastic Leukaemia: Longitudinal evaluation of time-specific cell population models
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500274
R. Licandro, Paolo Rota, M. Reiter, M. Kampel
Acute Lymphoblastic Leukaemia (ALL) is a disease induced by genetic lesions of blood progenitor cells, which influence hematopoiesis and result in the proliferation of undifferentiated (leukaemic) cells. The Minimal Residual Disease (MRD) value is used to quantify these cells and is reliably assessable using Flow CytoMetry (FCM) based measurements. It is a powerful predictor of treatment response and is thus used as a diagnostic tool for planning a patient's individual therapy. In this work we propose an evaluation scheme for longitudinal, disease-stage-dependent MRD assessment performed on clinical data collected from B-ALL cases after 15, 33 and 78 days of therapy, guided according to the standardised AIEOP-BFM2009 treatment protocol. We compare the blast-classification performance of time-specific population models trained using two different core approaches: generative and discriminative. The results show that cell populations change depending on the observed treatment day, and that a time-specific model of day 15 is not suitable for estimating leukaemic cell populations at treatment days 33 and 78, independent of the methodology evaluated.
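Of the two core approaches, the generative one can be sketched as a per-class Gaussian with a Bayes decision, whereas a discriminative model (e.g. logistic regression) would fit the decision boundary directly. The toy one-dimensional "marker intensity" data below is invented and does not reproduce the paper's models:

```python
import numpy as np

def fit_generative(X, y):
    """Generative classifier sketch: fit a diagonal Gaussian and a
    class prior per class, then predict the class with the larger
    log-posterior."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), Xc.var(axis=0) + 1e-6, len(Xc) / len(X))

    def predict(x):
        def log_post(c):
            mu, var, prior = params[c]
            ll = -0.5 * np.sum((x - mu) ** 2 / var + np.log(2 * np.pi * var))
            return ll + np.log(prior)
        return max(params, key=log_post)
    return predict

X = np.array([[0.0], [0.1], [5.0], [5.1]])  # two well-separated populations
y = np.array([0, 0, 1, 1])
predict = fit_generative(X, y)
```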
Citations: 3
Near-duplicate video detection based on an approximate similarity self-join strategy
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500278
H. B. D. Silva, Zenilton K. G. Patrocínio, G. Gravier, L. Amsaleg, A. Araújo, S. Guimarães
The huge amount of redundant multimedia data, such as video, has become a problem in terms of both storage space and copyright. Existing methods for identifying near-duplicate videos are usually neither adequate nor scalable for finding pairs of similar videos. The similarity self-join operation, which retrieves all similar pairs of elements from a video dataset, could be an alternative solution to this problem. Nonetheless, methods for similarity self-join perform poorly when applied to high-dimensional data. In this work, we propose a new approximate method that computes the similarity self-join in sub-quadratic time in order to solve the near-duplicate video detection problem. Our strategy is based on clustering techniques to find groups of videos which are similar to each other.
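The clustering-based pruning can be illustrated with the simplest possible variant: bucket points into grid cells of side `eps` and compare pairs only within a cell. Here grid cells stand in for the paper's video clusters; pairs straddling a cell boundary are missed, which is exactly the "approximate" part of the trade-off:

```python
import math
from collections import defaultdict

def approx_self_join(points, eps):
    """Approximate similarity self-join over 2-D points: hash each
    point to a grid cell of side eps, then test only within-cell
    pairs; sub-quadratic when the cells are balanced."""
    cells = defaultdict(list)
    for i, (x, y) in enumerate(points):
        cells[(int(x // eps), int(y // eps))].append(i)
    pairs = []
    for idx in cells.values():
        for i in range(len(idx)):
            for j in range(i + 1, len(idx)):
                if math.dist(points[idx[i]], points[idx[j]]) <= eps:
                    pairs.append((idx[i], idx[j]))
    return pairs

pairs = approx_self_join([(0.1, 0.1), (0.2, 0.2), (5.0, 5.0), (5.05, 5.0)],
                         eps=1.0)
```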
Citations: 3
Is the vascular network discriminant enough to classify renal cell carcinoma?
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500255
Alexis Zubiolo, E. Debreuve, D. Ambrosetti, P. Pognonec, X. Descombes
Renal cell carcinoma (RCC) is the most frequent type of kidney cancer (between 90% and 95% of cases). Twelve subtypes of RCC can be distinguished, among which clear cell carcinoma (ccRCC) and papillary carcinoma (pRCC) are the two most common (75% and 10% of cases, respectively). After resection (i.e., surgical removal), the tumor is prepared for histological examination (fixation, slicing, staining, observation under a microscope). Along with protein expression and genetic tests, the histological study allows the tumor to be classified and graded in order to make a prognosis and to decide on potential additional chemotherapy. Digital histology is a recent domain, since histological slices are routinely studied directly under the microscope. Pioneering works deal with the automatic analysis of cells. However, a crucial factor for RCC classification is the tumoral architecture, which relies on the structure of the vascular network. For example, coarsely speaking, ccRCC is characterized by a “fishnet” structure while pRCC has a tree-like structure. To our knowledge, no computerized analysis of the vascular network has been proposed yet. In this context, we developed a complete pipeline to extract the vascular network of a given histological slice and compute features of the underlying graph structure. We then studied the potential of such a feature-based approach for classifying a tumor as ccRCC or pRCC. Preliminary results on patient data are encouraging.
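The kind of graph-structure features involved can be illustrated on toy undirected networks: a tree-like (pRCC-style) graph has no independent cycles, while a mesh-like ("fishnet", ccRCC-style) graph does. The particular feature set below is our illustration, not the paper's:

```python
from collections import defaultdict

def graph_features(edges):
    """Simple structural features of a connected undirected graph:
    size, mean degree, and the cyclomatic number (count of
    independent cycles), which separates trees from meshes."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    n = len(adj)
    degrees = [len(nb) for nb in adj.values()]
    return {"nodes": n,
            "edges": len(edges),
            "mean_degree": sum(degrees) / n,
            "cyclomatic": len(edges) - n + 1}

mesh = graph_features([(0, 1), (1, 2), (2, 3), (3, 0)])  # one cycle
tree = graph_features([(0, 1), (0, 2), (0, 3)])          # no cycles
```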
Citations: 3
Histograms of Motion Gradients for real-time video classification
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500260
Ionut Cosmin Duta, J. Uijlings, T. Nguyen, K. Aizawa, Alexander Hauptmann, B. Ionescu, N. Sebe
Besides appearance information, video contains temporal evolution, an important and useful source of information about its content. Many video representation approaches are based on the motion information within the video. The common approach to extracting motion information is to compute the optical flow from the vertical and horizontal temporal evolution of two consecutive frames. However, the computation of optical flow is very demanding in terms of computational cost; in many cases it is the most expensive processing step within the overall pipeline of the target video analysis application. In this work we propose a very efficient approach to capturing the motion information within the video. Our method is based on a simple temporal and spatial derivation, which captures the changes between two consecutive frames. The proposed descriptor, Histograms of Motion Gradients (HMG), is validated on the UCF50 human action recognition dataset. Our HMG pipeline with several additional speed-ups is able to achieve real-time video processing and outperforms several well-known descriptors, including descriptors based on the costly optical flow.
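The "simple temporal and spatial derivation" can be sketched as follows: the temporal derivative is a plain frame difference, spatial gradients of that difference give an orientation and magnitude per pixel, and orientations are accumulated into a magnitude-weighted histogram. This is a minimal reading of the descriptor, not the authors' optimised pipeline (which adds cell/block aggregation and several speed-ups):

```python
import numpy as np

def motion_gradient_histogram(frame1, frame2, bins=8):
    """Minimal HMG-style descriptor for one frame pair: temporal
    difference, spatial gradients of the difference, then an
    L1-normalised, magnitude-weighted orientation histogram."""
    diff = frame2.astype(np.float64) - frame1      # temporal derivative
    gy, gx = np.gradient(diff)                     # spatial derivatives
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)         # orientation in [0, 2*pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
    s = hist.sum()
    return hist / s if s > 0 else hist

hist = motion_gradient_histogram(np.zeros((4, 4)),
                                 np.arange(16.0).reshape(4, 4))
```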
Citations: 29
Fuzzy clustering of lecture videos based on topic modeling
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500264
Subhasree Basu, Yi Yu, Roger Zimmermann
Lecture videos constitute an important part of the e-learning paradigm. These online video lectures contain multimedia materials aimed at explaining complex concepts more effectively. The videos are mostly grouped by subject. However, there are often overlaps between subjects, e.g. Mathematics and Statistics. Hence, in terms of educational content, some lecture videos can belong to more than one subject. When they are labeled with only one subject, students searching for the content of the lecture might miss some of these videos. To solve this problem, we aim to cluster these lecture videos based on their educational content rather than their titles, so that such lectures will not be missed because of their subject labels. Our novel algorithm uses topic modeling on video transcripts generated by automatic captions to extract the content of these videos. We choose representative text documents for each of the clusters from Wikipedia. We then calculate a similarity between the topics extracted from the videos and those of the representative documents of the clusters. Finally, we apply fuzzy clustering based on these similarity values and provide a lecture-content-based clustering for these lecture videos. The initial results are plausible and confirm the effectiveness of the proposed scheme.
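The soft assignment at the end can be sketched as cosine similarity against each subject's representative document, normalised into memberships that sum to one. The topic vectors and subject names below are invented, and real fuzzy clustering (e.g. fuzzy c-means) uses a more elaborate membership update than this plain normalisation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two topic-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def fuzzy_memberships(video_topics, subject_topics):
    """Fuzzy assignment: similarity to every subject, normalised so
    memberships sum to 1; a lecture may belong to several subjects."""
    sims = {s: cosine(video_topics, t) for s, t in subject_topics.items()}
    total = sum(sims.values())
    if total == 0:
        return {s: 1 / len(sims) for s in sims}
    return {s: v / total for s, v in sims.items()}

subjects = {"Mathematics": [1, 0, 0], "Statistics": [0, 1, 1]}
m = fuzzy_memberships([1, 1, 0], subjects)  # nonzero membership in both
```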
Citations: 24
Classifying Salsa dance steps from skeletal poses
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500244
Sotiris Karavarsamis, D. Ververidis, G. Chantas, S. Nikolopoulos, Y. Kompatsiaris
In this paper, we explore building classifiers to detect Salsa dance step primitives in the choreographies available in the Huawei 3DLife data set. Collectively, these can be an important component of dance tuition systems that support e-learning. A dance step is taken to be the shortest possible extract of bodily motion that can uniquely identify a particular repeatable movement through time. The adopted representation of dance steps is a concatenation of vectorized matrices involving the 3D coordinates of tracked body joints. Under this modeling context, a Salsa dance performance is seen as an ordered sequence of Salsa dance steps, requiring a multiple of the variables allocated in the representation of a single step. Following previous work by Masurelle & Essid that discusses the classification of six Salsa dance steps from 3DLife, we show that it is possible to obtain better classifiers under a similar experimental protocol in terms of both test accuracy and F-measure. By carefully re-annotating the data in 3DLife, we refocus on the six-step classification problem and then extend the protocol to the case of 20 dance steps. Compared to common off-the-shelf classifiers operating on the full-dimensional data, we show that it is possible to produce more accurate models by computing a subspace of the data. At the same time, it is possible to reduce the problematic bias in the resulting models caused by the uneven distribution of samples across step data classes. We provide and discuss experimental findings to support both hypotheses for the two experimental settings.
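The pose representation described above amounts to flattening per-frame joint coordinates and concatenating the frames. The joint count and frame data below are arbitrary, and the subspace projection the paper advocates would be applied to these vectors before classification:

```python
import numpy as np

def step_feature(frames):
    """Vectorise a dance step: each frame is a (joints, 3) array of
    tracked 3D joint coordinates; flatten each frame and concatenate
    the frames into a single feature vector."""
    return np.concatenate([np.asarray(f, dtype=float).ravel() for f in frames])

# Two frames of a 2-joint skeleton give a 2 * 2 * 3 = 12-dimensional vector
feature = step_feature([np.zeros((2, 3)), np.ones((2, 3))])
```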
Citations: 8
Journal: 2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI)