Proceedings IEEE International Conference on Multimedia Computing and Systems: Latest Articles

IRUS: image retrieval using shape
Pub Date : 1999-06-07 DOI: 10.1109/MMCS.1999.778552
Meirav Adoram, M. Lew
Finding shapes in image databases is a challenging topic in content based retrieval. In this paper the goal is to find database images which contain shapes similar to the query of the user. Unlike most solutions to this problem, the algorithm presented in this paper is meant to cope with changes in rotation, scale, translation, and lossy compression noise. A Java application was built which uses snakes and invariant moments. The GVF snake was used because it has two significant advantages over the traditional snake formulation. First, the GVF snake can fit into concavities, and second, the GVF snake can fit itself to objects using both expansion and contraction of the snake. The objects in the images were segmented with the active contours, and then invariant moments were calculated and compared with a minimum distance classifier. Retrieval quality of the system was measured with respect to original images, rotated images, scaled images, noisy images, and combinations of those distortions.
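The moment-and-classifier stage described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' Java implementation: the object is assumed to be already segmented into a set of foreground pixel coordinates (the GVF snake step is omitted), the "invariant moments" are assumed to be Hu's moment invariants (the abstract does not name them) and only the first two are used, and the function names `hu_invariants` and `nearest` are hypothetical.

```python
import math


def hu_invariants(pixels):
    """Return the first two Hu moment invariants of a binary shape.

    `pixels` is an iterable of (x, y) coordinates of foreground pixels.
    Assumption: the shape has already been segmented from the image.
    """
    pts = list(pixels)
    m00 = float(len(pts))                      # zeroth moment = area
    cx = sum(x for x, _ in pts) / m00          # centroid gives translation invariance
    cy = sum(y for _, y in pts) / m00

    def eta(p, q):
        # Central moment, normalized by area to reduce dependence on scale.
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pts)
        return mu / m00 ** (1 + (p + q) / 2.0)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    phi1 = n20 + n02
    phi2 = (n20 - n02) ** 2 + 4.0 * n11 ** 2
    return (phi1, phi2)


def nearest(query, database):
    """Minimum-distance classifier over feature vectors.

    `database` maps a label to a feature tuple; the label whose features
    have the smallest Euclidean distance to `query` is returned.
    """
    def distance(label):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(query, database[label])))

    return min(database, key=distance)
```

On a pixel grid these invariants are exact for translation and for 90-degree rotations; for arbitrary rotation angles and for scaling they hold only approximately, because the shape is resampled.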
Citations: 31
Two types of sound tool for editing speech signal: sound cutter and symbolic sound editor
Pub Date : 1999-06-07 DOI: 10.1109/MMCS.1999.778642
Tomoko Sagisaka, T. Munakata
In recent years, multimedia has been used in almost every educational setting. In particular, sound has proved effective in a variety of educational activities. In this demonstration, we show two novel sound tools for editing speech signals, called "Sound Cutter" and "Symbolic Sound Editor". These tools not only facilitate the production of hypermedia teaching and study materials but also assist language education. Sound Cutter automatically decomposes a continuous speech signal into several short segments based on pause positions and assigns each segment a serial ID for registration in a database. This allows users to rearrange the segments easily and produce a new file as they like. Symbolic Sound Editor automatically splits a speech signal into much smaller segments and assigns index numbers to them. Users can then edit the sound easily by referring to the index numbers instead of the waveform image.
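The pause-based splitting attributed to Sound Cutter can be illustrated with a small Python sketch. The abstract does not say how pauses are detected, so the short-time energy threshold used here is an assumption, as are the function name `cut_at_pauses` and all parameter names and default values.

```python
def cut_at_pauses(samples, frame_len=4, threshold=0.1, min_pause_frames=2):
    """Split a signal into voiced segments separated by pauses.

    Returns a list of (serial_id, start_index, end_index) tuples, where the
    indices are sample positions and end_index is exclusive.
    Assumption: a frame is "silent" when its mean squared amplitude does not
    exceed `threshold`, and a pause is at least `min_pause_frames` silent
    frames in a row.
    """
    n_frames = len(samples) // frame_len
    voiced = []
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(s * s for s in frame) / frame_len
        voiced.append(energy > threshold)

    segments = []
    start = None          # first frame of the segment being built
    silent_run = 0        # consecutive silent frames seen inside a segment
    for i, is_voiced in enumerate(voiced):
        if is_voiced:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_pause_frames:
                # Close the segment at the first silent frame of the pause.
                segments.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
    if start is not None:
        segments.append((start, n_frames - silent_run))

    # Assign serial IDs starting at 1 and convert frames to sample indices.
    return [(k + 1, s * frame_len, e * frame_len)
            for k, (s, e) in enumerate(segments)]
```

The serial IDs make it possible to refer to a segment by number, for example to reorder segments before writing a new file.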
Citations: 0
PowerDriverDTSS: The advanced demand responsive transport service system
Pub Date : 1999-06-07 DOI: 10.1109/MMCS.1999.778639
C. Lastrucci, L. Lastrucci, F. Casati
Powersoft has developed PowerDriverDTSS(R), a hardware and software architecture based on a proprietary algorithm for demand-responsive transport services with no fixed timetable, route, or stops. PowerDriverDTSS(R) addresses the major problems of this type of service, such as real-time customer booking with optimization of the vehicle path, while guaranteeing a high standard of service quality. The system is under evaluation by an Italian public bus transport operator.
Citations: 1
The THISL spoken document retrieval project
Pub Date : 1999-06-07 DOI: 10.1109/MMCS.1999.778655
S. Renals
THISL is an ESPRIT Long Term Research Project focused on the automatic indexing and retrieval of broadcast television and radio programmes. In particular it is concerned with the production of a demonstrator news-on-demand system to navigate an archive of BBC news broadcasts. Prototype systems based on both British and North American broadcast news have been constructed. The North American system has been successfully evaluated within the framework of the TREC-6 and TREC-7 spoken document retrieval tracks, and the system based on BBC TV and radio news archives will be evaluated by BBC R&D.
Citations: 5
The design of multimedia languages based on teleaction objects
Pub Date : 1999-06-07 DOI: 10.1109/MMCS.1999.778585
G. Polese, Shi-Kuo Chang, G. Tortora
We present a design methodology for multidimensional languages to be used in multimedia applications. The design framework extends methodologies for visual language design and relies on Teleaction Objects as a model for specifying and controlling multimedia presentations.
Citations: 1
Three dimensional wavelet transform video compression
Pub Date : 1999-06-07 DOI: 10.1109/MMCS.1999.778612
Ian Karl Levy, R. Wilson
A new approach to vector quantizer (VQ) codebook design for video data compression is described. This is based on the notion that symmetries in the data, which are seldom captured exactly in any training dataset, are both important perceptually and can lead to a more robust and effective codebook. The idea is illustrated using a 3D wavelet transformed video sequence. After discussing the relevant symmetries, a codebook design method is presented, based on a modification of the Linde-Buzo-Gray (1980) algorithm. This is applied to various video sequences. Comparisons drawn with other work in the area demonstrate that the scheme has potential and is worthy of further investigation.
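For orientation, the baseline that the paper modifies can be sketched as follows. This is the standard Linde-Buzo-Gray procedure (codebook growth by splitting, followed by nearest-neighbour and centroid updates), not the symmetry-aware variant the abstract proposes. The function name `lbg`, the additive split offset `eps`, and the fixed iteration count are illustrative assumptions; a full implementation would normally stop on a distortion threshold instead.

```python
def _sqdist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _centroid(cell):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(v[d] for v in cell) / len(cell) for d in range(len(cell[0]))]


def lbg(vectors, size, eps=0.01, iters=20):
    """Design a VQ codebook with the standard LBG splitting procedure.

    `size` is assumed to be a power of two, because each growth step
    doubles the number of code vectors.
    """
    codebook = [_centroid(vectors)]            # start from the global centroid
    while len(codebook) < size:
        # Split every code vector into two slightly perturbed copies.
        codebook = [[c + s for c in code]
                    for code in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Nearest-neighbour step: assign each vector to its closest code.
            cells = [[] for _ in codebook]
            for v in vectors:
                k = min(range(len(codebook)),
                        key=lambda i: _sqdist(v, codebook[i]))
                cells[k].append(v)
            # Centroid step: move each code to the mean of its cell,
            # keeping the old code if the cell is empty.
            codebook = [_centroid(cell) if cell else code
                        for cell, code in zip(cells, codebook)]
    return codebook
```

In a video coder the training vectors would be blocks of 3D wavelet coefficients; here they are plain lists or tuples of numbers.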
Citations: 11
Adding expressiveness in musical performance in real time
Pub Date : 1999-06-07 DOI: 10.1109/MMCS.1999.778645
A. Rodà, S. Canazza
Musical performance introduces some deviations from the nominal values specified in the score. Music reproduced without such variations is usually perceived as mechanical. Most investigations explore how the musical structure influences the performance. There are a few studies on how the musician's expressive intentions are reflected in the performance. The purpose of this work is to develop a model for the real-time expressive modification of musical performance. Perceptual analyses were conducted on some performances played with different intentions (correlated with a set of sensorial adjectives). From these analyses, two distinct expressive directions were observed: the first correlated with the "energy" and the second with the "kinetics" of the pieces. The resulting two-dimensional space (Perceptual Parametric Space, PPS) represents how the subjects arranged the pieces in their own minds. Acoustical analysis allowed us to correlate the expressive directions of the PPS with the main acoustic parameters. Each point of the PPS is therefore associated with a set of acoustic parameters. An analysis-by-synthesis method was used to validate the model. To produce computer-generated performances, we developed real-time software.
Citations: 4
On the automated interpretation and indexing of American Football
Pub Date : 1999-06-07 DOI: 10.1109/MMCS.1999.779303
M. Lazarescu, S. Venkatesh, G. West, T. Caelli
This paper combines natural language understanding and image processing with incremental learning to develop a system that can automatically interpret and index American Football. We have developed a model for representing the spatio-temporal characteristics of multiple objects in dynamic scenes in this domain. Our representation combines expert knowledge, domain knowledge, spatial knowledge and temporal knowledge. We also present an incremental learning algorithm that improves the knowledge base and keeps previously developed concepts consistent with new data. The advantages of the incremental learning algorithm are that it does not split concepts and that it generates a compact conceptual hierarchy which does not store instances.
Citations: 19
Relevance feedback techniques for image retrieval using multiple attributes
Pub Date : 1999-06-07 DOI: 10.1109/MMCS.1999.779320
Tat-Seng Chua, Chun-Xin Chu, M. Kankanhalli
The paper proposes a relevance feedback (RF) approach to content based image retrieval using multiple attributes. The proposed approach has been applied to images' text and color attributes. In order to ensure that meaningful features are extracted, a pseudo object model based on color coherence vector has been adopted to model color content. The RF approach employs techniques developed in the fields of information retrieval and machine learning to extract pertinent features from each of the attributes. It then uses the user's relevance judgments to estimate the importance of different attributes in an integrated content based image retrieval. The system developed has been tested on a large image collection containing over 12000 images. The results demonstrate that the proposed RF approaches and pseudo object based color model are effective.
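The color coherence vector that the color model builds on can be sketched as below. This shows only the plain CCV idea (a pixel is coherent if it belongs to a large connected region of its own color), not the pseudo object model or the relevance feedback step from the paper. The input is assumed to be already blurred and quantized to a small number of color indices, and the function name `color_coherence_vector` and the parameter `tau` are illustrative.

```python
from collections import deque


def color_coherence_vector(image, tau):
    """Compute a color coherence vector for a quantized image.

    `image` is a 2D list of color indices. Returns a dict mapping each color
    to a (coherent, incoherent) pixel count pair. A pixel counts as coherent
    when its 8-connected same-color region has at least `tau` pixels.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    ccv = {}
    for y in range(height):
        for x in range(width):
            if seen[y][x]:
                continue
            color = image[y][x]
            # Breadth-first flood fill to measure this connected region.
            size = 0
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and not seen[ny][nx]
                                and image[ny][nx] == color):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            coherent, incoherent = ccv.get(color, (0, 0))
            if size >= tau:
                coherent += size
            else:
                incoherent += size
            ccv[color] = (coherent, incoherent)
    return ccv
```

Two images can then be compared by summing, over all colors, the absolute differences of their coherent counts and of their incoherent counts.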
Citations: 13
Joint audio-video processing of MPEG encoded sequences
Pub Date : 1999-06-07 DOI: 10.1109/MMCS.1999.778288
Giuseppe Boccignone, M. D. Santo, G. Percannella
The current research efforts in the field of video parsing and analysis are focused on the use of pictorial information, while neglecting an important supplementary source of content information such as the embedded audio or soundtrack. In contrast, we address the issue of scene change detection with the use of video and audio information. We also discuss how joint exploitation of audio and video can be thoroughly performed on MPEG encoded video sequences. First experimental results are presented and discussed.
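One simple way to picture joint audio-video scene change detection is to require agreement between a visual cue and an audio cue. The abstract does not describe the actual decision rule or the MPEG compressed-domain features, so everything in this sketch is an assumption: it works on per-frame normalized color histograms and per-frame audio energy values that are taken as given, and the function name `scene_changes` and both thresholds are hypothetical.

```python
def scene_changes(histograms, audio_energy, video_thresh=0.5, audio_thresh=0.5):
    """Return frame indices i where a change is declared between i-1 and i.

    `histograms` is a list of normalized color histograms (each sums to 1),
    `audio_energy` is a list of audio energy values aligned with the frames.
    Assumption: a scene change needs BOTH a large histogram difference and a
    large jump in audio energy.
    """
    cuts = []
    for i in range(1, len(histograms)):
        prev, cur = histograms[i - 1], histograms[i]
        # Half the L1 distance between normalized histograms lies in [0, 1].
        video_diff = sum(abs(a - b) for a, b in zip(prev, cur)) / 2.0
        audio_diff = abs(audio_energy[i] - audio_energy[i - 1])
        if video_diff > video_thresh and audio_diff > audio_thresh:
            cuts.append(i)
    return cuts
```

Requiring both cues suppresses visual cuts inside a continuous soundtrack, such as a camera switch during one conversation; other combination rules are equally plausible.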
Citations: 12