Semantic video content annotation at the object level

Advances in Mobile Multimedia. Pub Date: 2012-12-03. DOI: 10.1145/2428955.2428991

Vanessa El-Khoury, Martin Jergler, David Coquil, H. Kosch


Citations: 7

Abstract

A vital prerequisite for fine-grained video content processing (indexing, querying, retrieval, adaptation, etc.) is the production of accurate metadata describing its structure and semantics. Several annotation tools presented in the literature generate metadata at different granularities (scenes, shots, frames, objects). These tools are limited in how they annotate objects: although they provide functionality to localize and annotate an object in a frame, propagating this information to subsequent frames still requires human intervention. Furthermore, they are based on video models that lack expressiveness along the spatial and semantic dimensions. To address these shortcomings, we propose the Semantic Video Content Annotation Tool (SVCAT) for structural and high-level semantic annotation. SVCAT is a semi-automatic annotation tool compliant with the MPEG-7 standard, which produces metadata according to an object-based video content model described in this paper. In particular, the novelty of SVCAT lies in its automatic propagation of object localization and description metadata, realized by tracking each object's contour through the video, which drastically reduces the annotator's workload. Experimental results show that SVCAT provides accurate metadata to object-based applications, in particular exact contours of multiple deformable objects.
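The propagation idea in the abstract (outline an object once, then carry its contour and description forward by tracking) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `ObjectAnnotation` record, the `propagate` function, and the toy `constant_shift_tracker` are hypothetical names, not SVCAT's data model, its tracking algorithm, or MPEG-7 syntax. A real system would replace the toy tracker with an actual contour tracker operating on pixel data.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]
Contour = List[Point]
# A tracker maps (contour in frame i, i) to the contour in frame i + 1.
Tracker = Callable[[Contour, int], Contour]


@dataclass
class ObjectAnnotation:
    """Hypothetical object-level record: one semantic label, one contour per frame."""
    label: str
    contours: Dict[int, Contour] = field(default_factory=dict)


def propagate(annotation: ObjectAnnotation, start_frame: int, end_frame: int,
              tracker: Tracker) -> ObjectAnnotation:
    """Fill in contours for frames start_frame+1..end_frame from the manual one."""
    if start_frame not in annotation.contours:
        raise KeyError(f"no manual contour for frame {start_frame}")
    contour = annotation.contours[start_frame]
    for frame in range(start_frame, end_frame):
        contour = tracker(contour, frame)          # estimate the next frame's contour
        annotation.contours[frame + 1] = contour   # the label is shared by all frames
    return annotation


def constant_shift_tracker(dx: float, dy: float) -> Tracker:
    """Toy stand-in for a real contour tracker: shifts every point by (dx, dy)."""
    def track(contour: Contour, _frame: int) -> Contour:
        return [(x + dx, y + dy) for x, y in contour]
    return track


# The annotator outlines the object once, in frame 0.
car = ObjectAnnotation(label="car",
                       contours={0: [(10.0, 10.0), (20.0, 10.0), (15.0, 18.0)]})
propagate(car, start_frame=0, end_frame=3, tracker=constant_shift_tracker(2.0, 0.0))
print(sorted(car.contours))  # frames that now carry a contour
print(car.contours[3])       # contour propagated to frame 3
```

Passing the tracker as a callable keeps the annotation record independent of the tracking method, so the single manual outline stays the only human input regardless of which tracker is plugged in.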


Source journal

Advances in Mobile Multimedia

Self-citation rate

0.00%

Number of publications