
Latest publications — 2011 9th International Workshop on Content-Based Multimedia Indexing (CBMI)

Automatic illustration with cross-media retrieval in large-scale collections
Pub Date: 2011-06-13 DOI: 10.1109/CBMI.2011.5972515
Filipe Coelho, Cristina Ribeiro
In this paper, we approach the task of finding suitable images to illustrate text, from specific news stories to more generic blog entries. We have developed an automatic illustration system supported by multimedia information retrieval that analyzes text and presents a list of candidate images to illustrate it. The system was tested on the SAPO-Labs media collection, containing almost two million images with short descriptions, and the MIRFlickr-25000 collection, with photos and user tags from Flickr. Visual content is described by the Joint Composite Descriptor and indexed by a Permutation-Prefix Index. Illustration is a three-stage process using textual search, score filtering and visual clustering. A preliminary evaluation using exhaustive and approximate visual searches demonstrates the capabilities of the visual descriptor and approximate indexing scheme used.
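The three-stage illustration pipeline described in the abstract (textual search, score filtering, visual clustering) can be sketched roughly as follows. The toy collection, the term-overlap scorer, the score threshold and the one-dimensional "visual" value are all invented for illustration; the paper itself uses the Joint Composite Descriptor and a Permutation-Prefix Index rather than this brute-force stand-in.

```python
def illustrate(query_terms, collection, score_floor=0.5, n_clusters=2):
    # Stage 1: textual search -- score images by query-term overlap.
    def text_score(caption):
        words = set(caption.lower().split())
        return len(words & query_terms) / len(query_terms)

    scored = [(img, text_score(img["caption"])) for img in collection]

    # Stage 2: score filtering -- drop weak textual matches.
    kept = [(img, s) for img, s in scored if s >= score_floor]

    # Stage 3: visual clustering -- crudely bucket by a toy 1-D
    # visual descriptor and keep the best-scored image per bucket.
    clusters = {}
    for img, s in kept:
        key = round(img["visual"] * n_clusters)
        if key not in clusters or s > clusters[key][1]:
            clusters[key] = (img, s)
    return [img["id"] for img, _ in clusters.values()]
```

Calling `illustrate({"sunset", "sea"}, …)` on a small captioned collection returns one representative candidate per visual cluster.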
Citations: 11
Audiovisual video context recognition using SVM and genetic algorithm fusion rule weighting
Pub Date: 2011-06-13 DOI: 10.1109/CBMI.2011.5972541
Mikko Roininen, E. Guldogan, M. Gabbouj
The recognition of the surrounding context from video recordings offers interesting possibilities for context awareness of video capable mobile devices. Multimodal analysis provides means for improved recognition accuracy and robustness in different use conditions. We present a multimodal video context recognition system fusing audio and video cues with support vector machines (SVM) and simple rules with genetic algorithm (GA) optimized weights. Multimodal recognition is shown to outperform the unimodal approaches in recognizing between 21 everyday contexts. The highest correct classification rate of 0.844 is achieved with SVM-based fusion.
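A minimal sketch of the rule-based side of such a system: per-class scores from each modality are fused by a weighted sum, with the modality weights tuned by a small GA-style mutate-and-select loop. The score dictionaries, the two-weight encoding and the toy optimizer below are invented stand-ins, not the authors' implementation.

```python
import random

def fuse(audio_scores, video_scores, w):
    # Weighted-sum fusion rule: one weight per modality.
    return {c: w[0] * audio_scores[c] + w[1] * video_scores[c]
            for c in audio_scores}

def classify(audio_scores, video_scores, w):
    fused = fuse(audio_scores, video_scores, w)
    return max(fused, key=fused.get)

def ga_optimize(samples, generations=30, pop_size=8, seed=0):
    # Toy GA: keep the best half of the weight population,
    # refill with Gaussian-mutated copies, repeat.
    rng = random.Random(seed)

    def fitness(w):
        return sum(classify(a, v, w) == y for a, v, y in samples)

    pop = [(rng.random(), rng.random()) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]
        children = [(max(0.0, w0 + rng.gauss(0, 0.1)),
                     max(0.0, w1 + rng.gauss(0, 0.1)))
                    for w0, w1 in parents]
        pop = parents + children
    return max(pop, key=fitness)
```

On labelled (audio, video, context) samples, `ga_optimize` returns the weight pair with the best fusion accuracy found.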
Citations: 1
Audio similarity matrices enhancement in an image processing framework
Pub Date: 2011-06-13 DOI: 10.1109/CBMI.2011.5972522
Florian Kaiser, Marina Georgia Arvanitidou, T. Sikora
Audio similarity matrices have become a popular tool in the MIR community for their ability to reveal segments of high acoustical self-similarity and repetitive patterns. This is particularly useful for the task of music structure segmentation. The performance of such systems, however, relies on the nature of the studied music pieces, and it is often assumed that harmonic and timbre variations remain low within musical sections. Since this condition is rarely fulfilled, similarity matrices are often too complex and structural information can hardly be extracted. In this paper we propose an image-oriented pre-processing of similarity matrices to highlight the conveyed musical information and reduce their complexity. The image segmentation processing step handles the image characteristics in order to provide meaningful spatial segments and thus enhance the music segmentation. An evaluation of a reference structure segmentation algorithm using the enhanced matrices is provided, and we show that our method strongly improves the segmentation performance.
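The self-similarity matrices the paper operates on can be sketched as follows, together with a mean filter as one simple example of treating the matrix as an image before segmentation. The cosine measure and the filter size are illustrative choices, not necessarily the paper's.

```python
import math

def cosine(u, v):
    # Cosine similarity between two audio feature frames.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_matrix(frames):
    # S[i][j] = similarity of frames i and j; repeated musical
    # sections show up as high-similarity off-diagonal stripes.
    return [[cosine(fi, fj) for fj in frames] for fi in frames]

def mean_filter(S, k=1):
    # (2k+1) x (2k+1) mean filter: image-style smoothing that
    # suppresses noise in the matrix before structure analysis.
    n = len(S)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            vals = [S[x][y]
                    for x in range(max(0, i - k), min(n, i + k + 1))
                    for y in range(max(0, j - k), min(n, j + k + 1))]
            out[i][j] = sum(vals) / len(vals)
    return out
```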
Citations: 6
A semantic-based and adaptive architecture for automatic multimedia retrieval composition
Pub Date: 2011-06-13 DOI: 10.1109/CBMI.2011.5972542
D. Giordano, I. Kavasidis, C. Pino, C. Spampinato
In this paper we present a domain-independent multimedia retrieval (MMR) platform. Currently, the use of MMR systems for different domains poses several limitations, mainly related to poor flexibility and adaptability to different domains and user requirements. We propose a semantic-based platform that uses ontologies to describe not only the application domain but also the processing workflow to be followed for retrieval, according to the user's requirements and domain characteristics. In detail, an ontological model (domain-processing ontology) that integrates domain peculiarities and processing algorithms allows self-adaptation of the retrieval mechanism to the specified application domain. According to the instances generated for each user request, our platform generates the appropriate interface (GUI) for the specified application domain (e.g. music, sport video, medical images, etc…) by a procedure guided by the defined domain-processing ontology. A use case on content-based music retrieval is presented in order to show how the proposed platform also facilitates the implementation of a multimedia retrieval system.
Citations: 13
High-level event detection in video exploiting discriminant concepts
Pub Date: 2011-06-13 DOI: 10.1109/CBMI.2011.5972525
Nikolaos Gkalelis, V. Mezaris, Y. Kompatsiaris
In this paper a new approach to video event detection is presented, combining visual concept detection scores with a new dimensionality reduction technique. Specifically, a video is first decomposed into a sequence of shots, and trained visual concept detectors are used to represent video content with model vector sequences. Subsequently, an improved subclass discriminant analysis method is used to derive a concept subspace for detecting and recognizing high-level events. In this space, the median Hausdorff distance is used to implicitly align and compare event videos of different lengths, and the nearest neighbor rule is used for recognizing the event depicted in the video. Evaluation results obtained by our participation in the Multimedia Event Detection Task of the TRECVID 2010 competition verify the effectiveness of the proposed approach for event detection and recognition in large scale video collections.
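The median Hausdorff comparison of model-vector sequences and the nearest-neighbor rule described above can be sketched as follows. The Euclidean frame distance and the toy event labels are assumptions for illustration; the paper applies this in the learned concept subspace.

```python
import statistics

def euclid(u, v):
    # Euclidean distance between two model vectors.
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def median_hausdorff(A, B, d=euclid):
    # Median Hausdorff distance between two sequences of model
    # vectors: the median (rather than max) of nearest-neighbour
    # distances, symmetrised over both directions. This implicitly
    # aligns event videos of different lengths.
    def directed(X, Y):
        return statistics.median(min(d(x, y) for y in Y) for x in X)
    return max(directed(A, B), directed(B, A))

def nearest_event(query, labelled):
    # 1-NN rule over labelled (event_name, sequence) exemplars.
    return min(labelled, key=lambda lv: median_hausdorff(query, lv[1]))[0]
```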
Citations: 25
Towards ontologies for image interpretation and annotation
Pub Date: 2011-06-13 DOI: 10.1109/CBMI.2011.5972547
Hichem Bannour, C. Hudelot
Due to the well-known semantic gap problem, a large number of approaches have been proposed during the last decade for automatic image annotation, i.e. the textual description of images. Since these approaches are still not sufficiently efficient, a new trend is to use semantic hierarchies of concepts or ontologies to improve the image annotation process. This paper presents an overview and an analysis of the use of semantic hierarchies and ontologies to provide a deeper image understanding and a better image annotation in order to furnish retrieval facilities to users.
Citations: 48
Saliency-aware color moments features for image categorization and retrieval
Pub Date: 2011-06-13 DOI: 10.1109/CBMI.2011.5972545
Miriam Redi, B. Mérialdo
Traditional window-based color indexing techniques have been widely used in image analysis and retrieval systems. In the existing approaches, all the image regions are treated with equal importance. However, some image areas carry more information about their content (e.g. the scene foreground). The human visual system indeed bases the categorization process on such a set of perceptually salient regions. Therefore, in order to improve the discriminative abilities of the color features for image recognition, higher importance should be given to the chromatic characteristics of more informative windows. In this paper, we present an informativeness-aware color descriptor based on the Color Moments feature [17]. We first define a saliency-based measure to quantify the amount of information carried by each image window; we then change the window-based CM feature according to the computed local informativeness. Finally, we show that this new hybrid feature outperforms the traditional Color Moments on a variety of challenging datasets for scene categorization, object recognition and video retrieval.
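A rough sketch of saliency-weighted window color moments, assuming each window is a list of per-pixel channel tuples and saliency weights are already computed. It uses only the first two moments (mean, standard deviation) and a simple multiplicative weighting, which may differ from the paper's exact formulation.

```python
def color_moments(window):
    # First two color moments (mean, standard deviation) per channel
    # of one image window; `window` is a list of per-pixel tuples.
    n = len(window)
    feats = []
    for channel in zip(*window):
        mean = sum(channel) / n
        var = sum((p - mean) ** 2 for p in channel) / n
        feats += [mean, var ** 0.5]
    return feats

def saliency_weighted_cm(windows, saliencies):
    # Scale each window's moment vector by its saliency weight so
    # that more informative windows dominate the global descriptor.
    feat = []
    for win, w in zip(windows, saliencies):
        feat += [w * f for f in color_moments(win)]
    return feat
```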
Citations: 4
Cross-site combination and evaluation of subword spoken term detection systems
Pub Date: 2011-06-13 DOI: 10.1109/CBMI.2011.5972521
Timo Mertens, R. Wallace, Daniel Schneider
The design and evaluation of subword-based spoken term detection (STD) systems depends on various factors, such as language, type of the speech to be searched and application scenario. The choice of the subword unit and search approach, however, is oftentimes made regardless of these factors. Therefore, we evaluate two subword STD systems across two data sets with varying properties to investigate the influence of different subword units on STD performance when working with different data types. Results show that on German broadcast news data, constrained search in syllable lattices is effective, whereas fuzzy phone lattice search is superior in more challenging English conversational telephone speech. By combining the key features of the two systems at an early stage, we achieve improvements in Figure of Merit of up to 13.4% absolute on the German data. We also show that the choice of the appropriate evaluation metric is crucial when comparing retrieval performances across systems.
Citations: 6
Game, shot and match: Event-based indexing of tennis
Pub Date: 2011-06-13 DOI: 10.1109/CBMI.2011.5972528
Damien Connaghan, Philip Kelly, N. O’Connor
Identifying events in sports video offers great potential for advancing visual sports coaching applications. In this paper, we present our results for detecting key events in a tennis match. Our overall goal is to automatically index a complete tennis match into all the main tennis events, so that a match can be recorded using affordable visual sensing equipment and then be automatically indexed into key events for retrieval and editing. The tennis events detected in this paper are a tennis game, a change of end and a tennis serve — all of which share temporal commonalities. There are of course other events in tennis which we aim to index in our overall indexing system, but this paper focuses solely on the aforementioned tennis events. This paper proposes a novel approach to detect key events in an instrumented tennis environment by analysing a player's location and visual features.
Citations: 19
Activities of daily living indexing by hierarchical HMM for dementia diagnostics
Pub Date: 2011-06-13 DOI: 10.1109/CBMI.2011.5972524
Svebor Karaman, J. Benois-Pineau, R. Mégret, J. Pinquier, Yann Gaëstel, J. Dartigues
This paper presents a method for indexing human activities in videos captured from a wearable camera worn by patients, for studies of the progression of dementia diseases. Our method aims to produce indexes to facilitate navigation throughout the individual video recordings, which could help doctors search for early signs of the disease in the activities of daily living. The recorded videos have strong motion and sharp lighting changes, inducing noise for the analysis. The proposed approach is based on a two-step analysis. First, we propose a new approach to segment this type of video, based on apparent motion. Each segment is characterized by two original motion descriptors, as well as color and audio descriptors. Second, a Hidden Markov Model formulation is used to merge the multimodal audio and video features, and classify the test segments. Experiments show the good properties of the approach on real data.
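HMM-based classification of segments can be illustrated with a toy discrete forward algorithm: each candidate activity model scores the observed segment sequence, and the most likely model wins. The single-state activity models and observation symbols below are invented stand-ins for the paper's fused audio-visual features and hierarchical HMM.

```python
import math

def forward_loglik(obs, start, trans, emit):
    # Forward algorithm: log P(obs | model) for a discrete HMM.
    # Toy version without scaling, so only for short sequences.
    alpha = {s: start[s] * emit[s][obs[0]] for s in start}
    for o in obs[1:]:
        alpha = {s2: sum(alpha[s1] * trans[s1][s2] for s1 in alpha) * emit[s2][o]
                 for s2 in trans}
    return math.log(sum(alpha.values()))

# Invented one-state activity models: (start, transition, emission).
models = {
    "cooking": ({"s": 1.0}, {"s": {"s": 1.0}}, {"s": {"chop": 0.8, "sit": 0.2}}),
    "resting": ({"s": 1.0}, {"s": {"s": 1.0}}, {"s": {"chop": 0.1, "sit": 0.9}}),
}

def classify_segment(obs, models):
    # Pick the activity model with the highest log-likelihood.
    return max(models, key=lambda name: forward_loglik(obs, *models[name]))
```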
Citations: 13