
IEEE International Conference on Multimedia and Expo, 2001 (ICME 2001): Latest Publications

Study on the models of the collaborative learning systems and proposal to the standardization activities
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237802
A. Koga
As a new learning method for the global age, collaborative learning has attracted wide interest. In collaborative learning, learners acquire skills such as problem finding, problem solving, self-regulation, and communication through discussion of shared problems, and the interactions between learners play an important role. To make it easier to design a learning environment in which desirable interactions occur, a description system is needed that expresses the learning process explicitly. We describe how such a description system enables us to develop tools for designing learning processes that involve interactions between learners, and tools that execute a designed process symbolically to show its feasibility. Finally, we describe how the description system can serve as the basis of the collaborative activity record format standard currently being developed in the collaborative learning technology WG of ISO/IEC JTC1/SC36.
Citations: 0
Trends of learning technology standard
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237803
K. Nakabayashi
In order to promote computer-based education and training, it is crucial to establish interoperability of learning contents, learner information, and learning system components. In the US, Europe, and Asia, government, industry, and academia are paying attention and making efforts in this direction. Several learning technology standardization initiatives are developing specifications covering quite a large field, including platforms, multimedia data, learning contents, learner information, and competency definitions. This paper discusses the need for learning technology standards, summarizes the efforts of each initiative, and describes the future direction of standardization work.
Citations: 0
A beat-pattern based error concealment scheme for music delivery with burst packet loss
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237658
Ye-Kui Wang
Error concealment is an important method for mitigating the degradation of audio quality when compressed audio packets are lost on error-prone channels such as the mobile Internet and digital audio broadcasting. This paper presents a novel error concealment scheme that exploits the beat and rhythmic pattern of music signals. Preliminary simulations show significantly improved subjective sound quality compared with conventional methods in the case of burst packet losses. The new scheme is proposed as a complement to prior art and can be adopted by essentially all existing perceptual audio decoders, such as an MP3 decoder for streaming music.
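The abstract does not spell out the algorithm, but the core idea of beat-pattern concealment can be sketched as follows: when a burst of frames is lost, substitute material from one beat period earlier, where the music is rhythmically similar. This is a minimal illustrative sketch, not the authors' implementation; the function name, frame representation, and the assumption of a known beat period are all hypothetical.

```python
def conceal_burst(frames, lost, beat_period):
    """Replace lost frames with frames one estimated beat period earlier.

    frames:       list of per-frame sample buffers (None where lost)
    lost:         set of lost frame indices
    beat_period:  estimated beat length in frames (assumed already estimated)
    """
    out = list(frames)
    for i in sorted(lost):
        src = i - beat_period
        if 0 <= src < len(out) and out[src] is not None:
            # copy rhythmically aligned material from one beat earlier
            out[i] = out[src]
        elif i > 0 and out[i - 1] is not None:
            # fall back to conventional previous-frame repetition
            out[i] = out[i - 1]
    return out
```

The fallback branch corresponds to the conventional repetition-based concealment the paper compares against.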
Citations: 16
Content-based music retrieval using linear scaling and branch-and-bound tree search
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237713
J. Jang, Hong-Ru Lee, M. Kao
This paper presents the use of linear scaling and tree search in a content-based music retrieval system that takes a user's acoustic input (an 8-second clip of singing or humming) via a microphone and retrieves the intended song from over 3000 candidate songs in the database. The system, known as Super MBox, demonstrates the feasibility of real-time content-based music retrieval with a high recognition rate. Super MBox first takes the user's acoustic input from a microphone and converts it into a pitch vector; a fast comparison engine using linear scaling and tree search then computes the similarity scores. In tests with about 1000 clips of input from people with mediocre singing skills, the top-20 recognition rate was about 73%.
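The linear-scaling step can be illustrated with a toy sketch: stretch or compress the query pitch vector by several tempo factors and keep the best (lowest) distance against each candidate. This is a simplified rendition of the general technique under assumed names and parameters, not the Super MBox implementation (which also uses branch-and-bound tree search and, typically, key normalization, both omitted here).

```python
def linear_scale(pitch, factor, length):
    """Stretch/compress a pitch vector by `factor`, sampled at `length` points."""
    n = len(pitch)
    return [pitch[min(int(i / factor), n - 1)] for i in range(length)]

def match_score(query, candidate, factors=(0.8, 0.9, 1.0, 1.1, 1.25)):
    """Best mean absolute pitch distance over several assumed tempo factors."""
    best = float("inf")
    for f in factors:
        scaled = linear_scale(query, f, len(candidate))
        d = sum(abs(a - b) for a, b in zip(scaled, candidate)) / len(candidate)
        best = min(best, d)
    return best
```

A lower score means a better match; ranking all candidates by this score yields the retrieval list from which the top-20 recognition rate is measured.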
Citations: 47
Recovery of motion vectors by detecting homogeneous movements for H.263 video communications
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237648
Sungchan Park, NamRye Son, Junghyun Kim, Gueesang Lee
In this paper, a new approach is proposed for the recovery of lost or erroneous motion vectors (MVs) by classifying the movements of neighboring blocks according to their homogeneity. The MVs of the neighboring blocks are classified by direction, a representative value is determined for each class, and the candidate MV with the minimum distortion is selected. Experimental results show that the proposed algorithm outperforms existing methods in many cases.
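The three steps described above (classify by direction, pick a representative per class, select the minimum-distortion candidate) can be sketched as follows. This is a hedged illustration of the stated idea, not the paper's exact algorithm: the eight-sector direction classification, the mean as the representative value, and the pluggable distortion measure are all assumptions.

```python
import math

def recover_mv(neighbor_mvs, distortion):
    """Recover a lost motion vector from the MVs of neighboring blocks.

    neighbor_mvs: list of (dx, dy) motion vectors from surrounding blocks
    distortion:   callable (dx, dy) -> error for a candidate MV
                  (e.g. a boundary-matching error against decoded pixels)
    """
    # 1. Classify neighbor MVs into coarse direction classes (8 sectors + zero).
    classes = {}
    for dx, dy in neighbor_mvs:
        if dx == 0 and dy == 0:
            key = "zero"
        else:
            key = round(math.atan2(dy, dx) / (math.pi / 4)) % 8
        classes.setdefault(key, []).append((dx, dy))

    # 2. One representative (component-wise mean) per class.
    candidates = []
    for mvs in classes.values():
        mdx = sum(v[0] for v in mvs) / len(mvs)
        mdy = sum(v[1] for v in mvs) / len(mvs)
        candidates.append((mdx, mdy))

    # 3. Select the candidate with minimum distortion.
    return min(candidates, key=lambda c: distortion(*c))
```

Grouping by direction first keeps a single outlier neighbor from dragging the average away from the dominant, homogeneous motion.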
Citations: 2
Toy interface for multimodal interaction and communication
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237805
K. Mase
Toy Interface is a real-world oriented interface that uses modeled objects with “toy”-like shapes and attributes as the interface between the real world and cyberspace. Toy interfaces can be categorized into three types: the doll type, the miniascape type, and the brick type. We investigate various toy interfaces and present the detailed design of a doll-type interface prototype for multimodal interaction and communication.
Citations: 0
Wayfinding and navigation in haptic virtual environments
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237781
S. Semwal
Cognitive maps are mental models of the relative locations and attributes of phenomena in spatial environments. The ability to form cognitive maps is one of the innate gifts of nature, and its absence can have a crippling effect, for example on the visually impaired, for whom the sense of touch becomes the primary source of cognitive maps. Once formed, cognitive maps provide a precise mapping of the physical world, so that a visually impaired individual can navigate successfully with minimal assistance. However, traditional mobility training is time consuming, and it is very difficult for the blind to express or revisit the cognitive maps formed after a training session is over. The proposed haptic environment will allow visually impaired individuals to express cognitive maps as 3D surface maps, with two PHANToM force-feedback devices guiding them. The 3D representation can be fine-tuned by the care-giver and then felt again by the visually impaired user in order to form precise cognitive maps. In addition to voice commentary, a library of pre-existing shapes familiar to the blind will provide orientation and proprioceptive haptic cues during navigation. A graphical display of cognitive maps will provide feedback to the care-giver or trainer. Because the haptic environment can be easily stored and retrieved, the MoVE system will also encourage navigation by the blind at their own convenience and with family members.
Citations: 4
Multimedia materials for teaching signal processing
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237885
X. Huang, G. Woolsey
Rapidly advancing capabilities in PC-based multimedia technology are providing new opportunities for the delivery of educational material. Multimedia technology is being introduced at all levels of the degrees in Electronics and Communications at the University of New England (UNE). In this paper, attention is drawn to the use of multimedia technology through the example of a fourth-year education package on signal processing. We have used this multimedia education package for teaching and learning during formal class periods and have encouraged students to use the technology in their own personal study and projects in order to increase their generic engineering skills. The success of the venture has encouraged us to extend the technology to other selected units in the UNE engineering programs.
Citations: 0
Semantic based retrieval model for digital audio and video
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237924
S. Nepal, Uma Srinivasan, G. Reynolds
Recent content-based retrieval systems such as QBIC [7] and VisualSEEk [8] use low-level audio-visual features such as color, pan, zoom, and loudness for retrieval. However, users prefer to retrieve videos using high-level semantics based on their perception, such as "bright color" and "very loud sound". This results in a gap between what users would like and what systems can generate. This paper attempts to bridge this gap by mapping users' perception of semantic concepts to low-level feature values, and proposes a model for providing high-level semantics for an audio feature that determines loudness. We first perform a pilot user study to capture the user perception of loudness level on a collection of audio clips of sound effects, and map the perceptions to five different semantic terms. We then describe how the loudness measure in MPEG-1 Layer II audio files can be mapped to user-perceived loudness, and devise a fuzzy technique for retrieving audio/video clips from the collections using those semantic terms.
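The fuzzy mapping from a loudness value to semantic terms can be sketched with triangular membership functions. This is an illustrative sketch only: the paper derives its five terms from a user study, whereas the term names, the normalized 0-100 scale, the breakpoints, and the 0.5 threshold below are all assumptions.

```python
def triangular(x, a, b, c):
    """Triangular fuzzy membership: peak at b, support (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Hypothetical membership functions over a normalized 0-100 loudness scale;
# the five-term idea follows the paper, the numbers do not.
TERMS = {
    "very quiet": (-1, 0, 25),
    "quiet":      (0, 25, 50),
    "moderate":   (25, 50, 75),
    "loud":       (50, 75, 100),
    "very loud":  (75, 100, 101),
}

def retrieve(clips, term, threshold=0.5):
    """Return names of clips whose loudness matches `term` above `threshold`.

    clips: list of (name, normalized_loudness) pairs
    """
    a, b, c = TERMS[term]
    return [name for name, loud in clips if triangular(loud, a, b, c) >= threshold]
```

Because the memberships overlap, a clip near a breakpoint can match two adjacent terms, which is exactly the graded behavior a crisp threshold per term would lose.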
Citations: 1
Automatic caption localization in videos using salient points
Pub Date : 2001-08-22 DOI: 10.1109/ICME.2001.1237657
M. Bertini, C. Colombo, A. Bimbo
Broadcasters are showing interest in building digital archives of their assets for the reuse of archive materials in TV programs, on-line availability, and archiving. This requires tools for video indexing and retrieval by content that exploit high-level video information, such as that contained in superimposed text captions. In this paper we present a method to automatically detect and localize captions in digital video using temporal and spatial local properties of salient points in video frames. Results of experiments on both high-resolution DV sequences and standard VHS videos are presented and discussed.
Citations: 28