
IEEE International Conference on Multimedia and Expo, 2001. ICME 2001. Latest publications

Trends of learning technology standard
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237803
K. Nakabayashi
In order to promote computer-based education and training, it is crucial to establish interoperability of learning content, learner information, and learning system components. In the US, Europe, and Asia, government, industry, and academia are paying attention to and making efforts in this direction. Several learning technology standardization initiatives are developing specifications that cover quite large fields such as platforms, multimedia data, learning content, learner information, and competency definitions. This paper discusses the need for learning technology standards, summarizes the efforts of each initiative, and describes the future direction of standardization work.
Citations: 0
A hierarchical graph model for probing multimedia applications
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237725
Baochun Li
In order to achieve the best application-level Quality-of-Service (QoS), complex multimedia applications need to be dynamically tuned and reconfigured to adapt to the unpredictable open environments offered by general-purpose systems. We believe that the objective of such adaptations should be to maintain a stable QoS with respect to a set of critical application QoS parameters. However, we have observed that only a limited set of parameters may be used as “tuning knobs” to affect the application behavior. In this paper, we present a hierarchical graph model to discover the relationships between the sets of tunable and critical QoS parameters. Based on such a model, we propose a polynomial-complexity QoS probing algorithm to quantitatively capture the run-time relationships between the two sets of parameters. Our probing algorithm is integrated into our broader framework, Agilos, which uses a configurable visual tracking application to verify the effectiveness of adaptations.
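The probing idea in the abstract (perturb a tunable knob and observe the response of each critical QoS parameter) can be sketched as a finite-difference sensitivity estimate. Everything below is a hypothetical illustration: `ToyApp`, its knob and QoS names, and the linear model are invented, and neither Agilos nor the paper's hierarchical graph model is reproduced here.

```python
# Sketch of run-time QoS probing: perturb each tunable parameter in turn and
# record how each critical QoS parameter responds. ToyApp and its parameter
# names are hypothetical stand-ins for a real multimedia application.

class ToyApp:
    """Invented application model: frame rate falls with frame size,
    quality falls with the quantization level."""
    def __init__(self):
        self.state = {"frame_size": 10.0, "quant": 5.0}

    def tune(self, knob, delta):
        self.state[knob] += delta

    def measure(self, qos):
        if qos == "frame_rate":
            return 100.0 / self.state["frame_size"]
        return 100.0 - 4.0 * self.state["quant"]  # "quality"

def probe(app, knobs, qos_names, delta=1.0):
    """Estimate d(qos)/d(knob) for every pair by finite differences."""
    base = {q: app.measure(q) for q in qos_names}
    sensitivity = {}
    for k in knobs:
        app.tune(k, delta)
        sensitivity[k] = {q: (app.measure(q) - base[q]) / delta
                          for q in qos_names}
        app.tune(k, -delta)  # restore the knob before probing the next one
    return sensitivity
```

Running `probe(ToyApp(), ["frame_size", "quant"], ["frame_rate", "quality"])` exposes which knobs actually influence which QoS parameters; in this toy model `quant` moves only `quality`, and `frame_size` moves only `frame_rate`.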
Citations: 3
Fast full search based block matching algorithm from fast kick-off of impossible candidate checking points
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237815
Jong-Nam Kim, Sung-Cheal Byun, Byung-Ha Ahn
To reduce the amount of computation of the full search (FS) algorithm for fast motion estimation, we propose a new, fast matching algorithm that, unlike other fast methods, causes no degradation of the predicted images relative to the conventional FS. The computational reduction without any degradation of the predicted image comes from the fast kick-off of impossible motion vectors. This fast kick-off of improper motion vectors is achieved by sequential rejection based on a derived formula and subblock norms. The sequential rejection of impossible candidates relies on multiple decision boundaries. Our proposed algorithm reduces computation more than recent fast full search (FS) motion estimation algorithms.
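The candidate-rejection idea can be illustrated with a block-sum lower bound on the sum of absolute differences (SAD), in the spirit of successive-elimination methods; the paper's own derived formula and subblock norms are not reproduced here. Because |sum(R) - sum(C)| <= SAD(R, C), a candidate whose bound already meets or exceeds the best SAD found so far cannot improve the match and is skipped without computing its full SAD.

```python
# Sketch of norm-based candidate rejection in full-search block matching.
# Blocks are flattened lists of pixel values; candidates that cannot beat
# the current best SAD are "kicked off" using the cheap block-sum bound.

def sad(a, b):
    """Sum of absolute differences between two equal-size blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))

def full_search_with_rejection(ref_block, candidates):
    """Return (best_index, best_sad, n_full_sads_computed)."""
    ref_sum = sum(ref_block)
    best_idx, best_sad = None, float("inf")
    computed = 0
    for i, cand in enumerate(candidates):
        # Cheap lower bound on SAD from block sums (triangle inequality).
        if abs(ref_sum - sum(cand)) >= best_sad:
            continue  # impossible candidate: skip the full SAD
        computed += 1
        d = sad(ref_block, cand)
        if d < best_sad:
            best_idx, best_sad = i, d
    return best_idx, best_sad, computed
```

The result is identical to an exhaustive search, since only candidates provably worse than the current best are skipped; this is what "no degradation of the predicted images" means for this family of algorithms.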
Citations: 0
Content-based music retrieval using linear scaling and branch-and-bound tree search
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237713
J. Jang, Hong-Ru Lee, M. Kao
This paper presents the use of linear scaling and tree search in a content-based music retrieval system that can take a user's acoustic input (an 8-second clip of singing or humming) via a microphone and then retrieve the intended song from over 3000 candidate songs in the database. The system, known as Super MBox, demonstrates the feasibility of real-time content-based music retrieval with a high recognition rate. Super MBox first takes the user's acoustic input from a microphone and converts it into a pitch vector. A fast comparison engine using linear scaling and tree search is then employed to compute the similarity scores. We have tested Super MBox on about 1000 clips of test input from people with mediocre singing skills and found a top-20 recognition rate of about 73%.
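The linear-scaling step can be sketched as follows: the query pitch vector is resampled by several factors to compensate for tempo differences, and the best-matching scaled version determines the score. This is a generic illustration; Super MBox's actual resampling, key transposition, and branch-and-bound pruning are not reproduced.

```python
# Sketch of linear-scaling matching over pitch vectors (lists of pitch
# values, e.g. MIDI note numbers). Scaling factors are placeholder choices.

def scale(query, factor):
    """Linearly resample `query` to round(len * factor) points."""
    n = max(1, round(len(query) * factor))
    return [query[min(len(query) - 1, int(i * len(query) / n))]
            for i in range(n)]

def distance(q, song):
    """Mean absolute pitch difference over the overlapping prefix."""
    m = min(len(q), len(song))
    return sum(abs(a - b) for a, b in zip(q[:m], song[:m])) / m

def linear_scaling_score(query, song, factors=(0.75, 1.0, 1.25)):
    """Best (lowest) distance over all scaling factors; lower is better."""
    return min(distance(scale(query, f), song) for f in factors)
```

Ranking every song in the database by this score and returning the top matches is the brute-force version of the retrieval step; the paper's tree search serves to prune that comparison.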
Citations: 47
Recovery of motion vectors by detecting homogeneous movements for H.263 video communications
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237648
Sungchan Park, NamRye Son, Junghyun Kim, Gueesang Lee
In this paper, a new approach is proposed for recovering lost or erroneous motion vectors (MVs) by classifying the movements of neighboring blocks according to their homogeneity. The MVs of the neighboring blocks are classified by direction, a representative value is determined for each class, and the candidate MV with the minimum distortion is selected. Experimental results show that the proposed algorithm performs better than existing methods in many cases.
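The recovery scheme described above can be sketched as: group the neighboring MVs into angular classes, take a representative per class, and pick the representative with minimum distortion. The bin count, the median representative, and the distortion callback below are placeholder choices, not the paper's exact formulation.

```python
# Sketch of motion-vector recovery from neighboring blocks. MVs are (dx, dy)
# tuples; `distortion` is a caller-supplied cost (e.g. boundary mismatch).
import math

def classify_by_direction(mvs, n_bins=8):
    """Group motion vectors into angular bins."""
    groups = {}
    for mv in mvs:
        ang = math.atan2(mv[1], mv[0]) % (2 * math.pi)
        groups.setdefault(int(ang / (2 * math.pi / n_bins)), []).append(mv)
    return list(groups.values())

def representative(group):
    """Component-wise median as the group's representative MV."""
    xs = sorted(mv[0] for mv in group)
    ys = sorted(mv[1] for mv in group)
    mid = len(group) // 2
    return (xs[mid], ys[mid])

def recover_mv(neighbor_mvs, distortion):
    """Pick the representative candidate with minimum distortion."""
    cands = [representative(g) for g in classify_by_direction(neighbor_mvs)]
    return min(cands, key=distortion)
```

Classifying before averaging matters at object boundaries: a plain mean of all neighbor MVs would blend two homogeneous motions into a vector that matches neither.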
Citations: 2
Toy interface for multimodal interaction and communication
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237805
K. Mase
Toy Interface is a real-world-oriented interface that uses modeled objects with “toy”-like shapes and attributes as the interface between the real world and cyberspace. A toy interface can be categorized into one of three types: the doll type, the miniascape type, and the brick type. We investigate various toy interfaces and present the design details of a doll-type interface prototype for multimodal interaction and communication.
Citations: 0
Wayfinding and navigation in haptic virtual environments
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237781
S. Semwal
Cognitive maps are mental models of the relative locations and attribute phenomena of spatial environments. The ability to form cognitive maps is one of the innate gifts of nature. An absence of this ability can have a crippling effect, for example, on the visually impaired. The sense of touch becomes the primary source for forming cognitive maps for the visually impaired. Once formed, cognitive maps provide a precise mapping of the physical world so that a visually impaired individual can successfully navigate with minimal assistance. However, traditional mobility training is time consuming, and it is very difficult for the blind to express or revisit the cognitive maps formed after a training session is over. The proposed haptic environment will allow visually impaired individuals to express cognitive maps as 3D surface maps, with two PHANToM force-feedback devices guiding them. The 3D representation can be fine-tuned by the care-giver, and then felt again by the visually impaired in order to form precise cognitive maps. In addition to voice commentary, a library of pre-existing shapes familiar to the blind will provide orientation and proprioceptive haptic cues during navigation. A graphical display of cognitive maps will provide feedback to the care-giver or trainer. As the haptic environment can be easily stored and retrieved, the MoVE system will also encourage the blind to navigate at their own convenience, and with family members.
Citations: 4
Multimedia materials for teaching signal processing
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237885
X. Huang, G. Woolsey
Rapidly advancing capabilities in PC-based multimedia technology are providing new opportunities for the delivery of educational material. Multimedia technology is being introduced at all levels of the degrees in Electronics and Communications at the University of New England (UNE). In this paper, attention is drawn to the use of multimedia technology through the example of a fourth-year education package on signal processing. We have used this multimedia education package for teaching and learning during formal class periods and to encourage students to use the technology in their own personal study and projects in order to increase their generic engineering skills. The success of the venture has encouraged us to extend the technology to other selected units in the UNE engineering programs.
Citations: 0
Semantic based retrieval model for digital audio and video
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237924
S. Nepal, Uma Srinivasan, G. Reynolds
Recent content-based retrieval systems such as QBIC [7] and VisualSEEk [8] use low-level audio-visual features such as color, pan, zoom, and loudness for retrieval. However, users prefer to retrieve videos using high-level semantics based on their perception, such as "bright color" and "very loud sound". This results in a gap between what users would like and what systems can generate. This paper is an attempt to bridge this gap by mapping users' perception (of semantic concepts) to low-level feature values. It proposes a model for providing high-level semantics for an audio feature that determines loudness. We first perform a pilot user study to capture users' perception of loudness levels on a collection of audio clips of sound effects, and map them to five different semantic terms. We then describe how the loudness measure in MPEG-1 Layer II audio files can be mapped to user-perceived loudness. Finally, we devise a fuzzy technique for retrieving audio/video clips from the collections using those semantic terms.
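A mapping of this kind can be sketched with triangular fuzzy membership functions. The term set, the dB breakpoints, and the scale below are invented for illustration; the paper derives its actual mapping from the pilot user study.

```python
# Sketch of a fuzzy mapping from a measured loudness value to semantic
# terms. Term names and (a, b, c) breakpoints are hypothetical.

def triangular(x, a, b, c):
    """Triangular membership: 0 at or outside [a, c], 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

TERMS = {  # loudness in dB on an assumed scale
    "quiet":     (0, 20, 45),
    "moderate":  (30, 50, 70),
    "loud":      (55, 75, 90),
    "very loud": (80, 95, 120),
}

def memberships(loudness_db):
    """Degree of membership of a loudness value in each semantic term."""
    return {t: triangular(loudness_db, *abc) for t, abc in TERMS.items()}

def best_term(loudness_db):
    """The single term that best describes the value."""
    m = memberships(loudness_db)
    return max(m, key=m.get)
```

Because adjacent triangles overlap, a clip can partially match two terms (e.g. both "moderate" and "loud"), which is what lets a fuzzy retrieval engine rank clips by degree of match rather than filter them with hard thresholds.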
Citations: 1
Automatic caption localization in videos using salient points
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237657
M. Bertini, C. Colombo, A. Bimbo
Broadcasters are showing interest in building digital archives of their assets for reuse of archive materials in TV programs, on-line availability, and archiving. This requires tools for video indexing and retrieval by content that exploit high-level video information, such as that contained in superimposed text captions. In this paper we present a method to automatically detect and localize captions in digital video using temporal and spatial local properties of salient points in video frames. Results of experiments on both high-resolution DV sequences and standard VHS videos are presented and discussed.
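The temporal part of the method (captions tend to stay at the same position across frames while scene points move) can be sketched as follows. Salient-point detection itself is abstracted away as input, and the persistence and tolerance thresholds are invented.

```python
# Sketch of temporal filtering of salient points for caption localization:
# points that reappear at (nearly) the same position over several
# consecutive frames are kept as caption candidates.

def persistent_points(frames_points, min_frames=3, tol=1):
    """frames_points: one list of (x, y) salient points per frame.
    Return points of the first frame that recur within `tol` pixels in at
    least `min_frames` consecutive frames."""
    def near(p, pts):
        return any(abs(p[0] - q[0]) <= tol and abs(p[1] - q[1]) <= tol
                   for q in pts)
    stable = []
    for p in frames_points[0]:
        run = 1
        for pts in frames_points[1:]:
            if near(p, pts):
                run += 1
            else:
                break
        if run >= min_frames:
            stable.append(p)
    return stable
```

A spatial pass would follow: surviving points that cluster into horizontal bands of roughly character height become the localized caption boxes.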
Citations: 28