Semantic based retrieval model for digital audio and video

S. Nepal, Uma Srinivasan, G. Reynolds
Published in: IEEE International Conference on Multimedia and Expo, 2001 (ICME 2001)
Publication date: 2001-08-22
DOI: https://doi.org/10.1109/ICME.2001.1237924
Citations: 1

Abstract

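Recent content-based retrieval systems such as QBIC [7] and VisualSEEk [8] use low-level audio-visual features such as color, pan, zoom, and loudness for retrieval. However, users prefer to retrieve videos using high-level semantics based on their perception, such as "bright color" and "very loud sound". This results in a gap between what users would like and what systems can generate. This paper attempts to bridge that gap by mapping users' perception of semantic concepts to low-level feature values. It proposes a model for providing high-level semantics for an audio feature that determines loudness. We first perform a pilot user study to capture users' perception of loudness level on a collection of sound-effect audio clips, and map the perceived levels to five different semantic terms. We then describe how the loudness measure in MPEG-1 Layer II audio files can be mapped to user-perceived loudness. Finally, we devise a fuzzy technique for retrieving audio/video clips from the collections using those semantic terms.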
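The abstract does not give the five semantic terms, the membership functions, or the loudness scale, so the following is only a rough sketch of how fuzzy retrieval by semantic term could look. It assumes loudness has already been normalized to the range 0 to 1, uses triangular membership functions, and invents the term names, breakpoints, and the `retrieve` helper purely for illustration; none of these come from the paper.

```python
# Hypothetical semantic terms mapped to (left, peak, right) breakpoints on a
# normalized loudness scale in [0, 1]. The paper derives its own mapping from
# a user study; these values are illustrative assumptions only.
TERMS = {
    "very quiet": (0.0, 0.0, 0.25),
    "quiet": (0.0, 0.25, 0.5),
    "medium": (0.25, 0.5, 0.75),
    "loud": (0.5, 0.75, 1.0),
    "very loud": (0.75, 1.0, 1.0),
}


def triangular(x, left, peak, right):
    """Degree (0..1) to which loudness x belongs to a triangular fuzzy set."""
    if x < left or x > right:
        return 0.0
    if x == peak:
        return 1.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def retrieve(clips, term, threshold=0.5):
    """Return (clip name, membership) pairs whose membership in `term`
    is at least `threshold`, best match first.

    `clips` maps a clip name to its normalized loudness (an assumption;
    a real system would compute this from the audio stream).
    """
    left, peak, right = TERMS[term]
    scored = [
        (name, triangular(loudness, left, peak, right))
        for name, loudness in clips.items()
    ]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up clip names and loudness values for demonstration.
    example_clips = {"whisper": 0.05, "speech": 0.5, "door_slam": 0.8, "explosion": 0.98}
    for name, score in retrieve(example_clips, "very loud"):
        print(f"{name}: {score:.2f}")
```

Because membership is graded rather than binary, a query term returns a ranked list, and the threshold controls how strictly a clip must match the term before it is returned.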