
Latest Publications: 2005 IEEE International Conference on Multimedia and Expo

Events Detection for an Audio-Based Surveillance System
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521669
C. Clavel, T. Ehrette, G. Richard
The present research deals with audio event detection in noisy environments for a multimedia surveillance application. In surveillance or homeland security, most systems aiming to automatically detect abnormal situations are based only on visual cues, while in some situations it may be easier to detect a given event using audio information. This is particularly the case for the class of sounds considered in this paper: sounds produced by gunshots. The automatic shot detection system presented is based on a novelty detection approach, which offers a solution for detecting abnormality (abnormal audio events) in continuous audio recordings of public places. We specifically focus on the robustness of the detection against variable and adverse conditions, and on the reduction of the false rejection rate, which is particularly important in surveillance applications. In particular, we take advantage of the potential similarity between the acoustic signatures of different types of weapons by building a hierarchical classification system.
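The core idea of the abstract, training only on "normal" ambient audio and flagging whatever the model rejects, can be sketched with a one-class classifier. Below is a minimal illustration assuming a one-class SVM over two toy spectral features; the paper's actual front-end, model and hierarchical weapon-class stage are not reproduced here.

```python
# Minimal novelty-detection sketch: fit on background audio only,
# then flag frames the model rejects as abnormal events.
import numpy as np
from sklearn.svm import OneClassSVM

def frame_features(signal, sr, frame_len=1024, hop=512):
    """Per-frame log-energy and spectral centroid (illustrative features only)."""
    feats = []
    for start in range(0, len(signal) - frame_len, hop):
        frame = signal[start:start + frame_len]
        spec = np.abs(np.fft.rfft(frame))
        freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
        energy = np.log(np.sum(frame ** 2) + 1e-12)
        centroid = np.sum(freqs * spec) / (np.sum(spec) + 1e-12)
        feats.append([energy, centroid])
    return np.array(feats)

rng = np.random.default_rng(0)
background = rng.normal(0, 0.01, 16000 * 60)           # stand-in for street noise
model = OneClassSVM(nu=0.01, kernel="rbf", gamma="scale")
model.fit(frame_features(background, sr=16000))

test = rng.normal(0, 0.01, 16000 * 5)
test[16000:16512] += rng.normal(0, 0.5, 512)           # loud transient ~ a "shot"
flags = model.predict(frame_features(test, sr=16000))  # -1 marks abnormal frames
print(np.where(flags == -1)[0])
```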
Citations: 345
Video quality analysis for an automated video capturing and editing system for conversation scenes
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521514
Takashi Nishizaki, R. Ogata, Yuichi Kameda, Yoshinari Ohta, Yuichi Nakamura
This paper introduces video quality analysis for automated video capture and editing. Previously, we proposed an automated video capture and editing system for conversation scenes. In the capture phase, our system not only produces concurrent video streams with multiple pan-tilt-zoom cameras but also recognizes "conversation states," i.e., who is speaking, when someone is nodding, etc. As the automated editing phase requires knowledge of the conversation states, it is important to clarify how the recognition rate of these conversation attributes affects the quality of the videos produced by our editing system. In the present study, we analyzed the relationship between the recognition rate of conversation states and the quality of the resultant videos through subjective evaluation experiments. The quality scores of the resultant videos were almost the same as in the best case, in which recognition was done manually, and the recognition rate of our capture system was therefore sufficient.
Citations: 0
The Multimedian Concert-Video Browser
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521732
Y. V. Houten, S. U. Naci, Bauke Freiburg, R. Eggermont, Sander Schuurman, Danny Hollander, J. Reitsma, Maurice Markslag, Justin Kniest, Mattina Veenstra, A. Hanjalic
The MultimediaN concert-video browser demonstrates a video interaction environment for efficiently browsing video registrations of pop, rock and other music concerts. The exhibition displays the current state of the project, which aims to deliver an advanced concert-video browser in 2007. Three demos are provided: 1) a high-level content analysis methodology for modeling the "experience" of the concert at its different stages and for automatically detecting and identifying semantically coherent temporal segments in concert videos; 2) a general-purpose video editor that associates semantic descriptions with video segments using both manual and automatic inputs, together with a video browser that applies ideas from information foraging theory and demonstrates patch-based video browsing; and 3) the Fabplayer, specifically designed for patch-based browsing of concert videos by a dedicated user group, making use of the results of automatic concert-video segmentation.
Citations: 8
Adaptive hierarchical multi-class SVM classifier for texture-based image classification
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521640
Song Liu, Haoran Yi, L. Chia, D. Rajan
In this paper, we present a new classification scheme based on support vector machines (SVM) and a new texture feature, called the texture correlogram, for high-level image classification. Originally, the SVM classifier was designed to solve only binary classification problems. In order to deal with multiple classes, we present a new method to dynamically build up a hierarchical structure from the training dataset. The texture correlogram is designed to capture spatial distribution information. Experimental results demonstrate that the proposed classification scheme and texture feature are effective for high-level image classification tasks, and that the proposed scheme is more efficient than the other schemes while achieving almost the same classification accuracy. Another advantage of the proposed scheme is that the underlying hierarchical structure of the SVM classification tree manifests the interclass relationships among the different classes.
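As a rough sketch of how a hierarchical multi-class structure can be grown dynamically from training data, the toy code below clusters class centroids into two groups and trains a binary SVM at each node of the resulting tree. The split criterion, features and kernel are assumptions for illustration, not the paper's exact method.

```python
# Hierarchical multi-class SVM sketch: each internal node holds a binary
# SVM deciding between two groups of classes; leaves hold single classes.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

class Node:
    def __init__(self, classes):
        self.classes, self.svm, self.left, self.right = classes, None, None, None

def build_tree(X, y, classes):
    node = Node(classes)
    if len(classes) == 1:
        return node
    # Split the classes into two groups by clustering their centroids.
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    groups = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(centroids)
    left = [c for c, g in zip(classes, groups) if g == 0]
    right = [c for c, g in zip(classes, groups) if g == 1]
    mask = np.isin(y, left)
    node.svm = SVC(kernel="rbf").fit(X, mask.astype(int))   # 1 = left group
    node.left = build_tree(X[mask], y[mask], left)
    node.right = build_tree(X[~mask], y[~mask], right)
    return node

def predict_one(node, x):
    while node.svm is not None:
        node = node.left if node.svm.predict(x[None, :])[0] == 1 else node.right
    return node.classes[0]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, (20, 2)) for c in range(4)])
y = np.repeat(np.arange(4), 20)
root = build_tree(X, y, list(range(4)))
print(predict_one(root, np.array([2.0, 2.0])))   # expect class 2
```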
Citations: 41
Automatic mobile sports highlights
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521504
K. Wan, Xin Yan, Changsheng Xu
We report on our development of a real-time system to deliver sports video highlights of a live game to mobile videophones over existing GPRS networks. To facilitate real-time analysis, a circular buffer receives live video data, from which simple audio/visual features are computed to test for highlight-worthiness according to an a priori decision scheme. A separate module runs algorithms to insert content into the highlight for mobile advertising. The system is now under trial over new 3G networks.
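The circular-buffer mechanism is easy to sketch: keep the last N seconds of frames in a fixed-size buffer and emit them as a highlight when a cue crosses a decision threshold. The toy version below assumes a single audio-energy cue and a fixed threshold; the paper's feature set and decision scheme are richer.

```python
# Circular buffer over live frames: when the cue fires, the buffer
# contents form the candidate highlight clip.
from collections import deque
import numpy as np

BUFFER_SECONDS, FPS = 30, 25
buffer = deque(maxlen=BUFFER_SECONDS * FPS)      # oldest frames drop out

def on_new_frame(frame, audio_energy, threshold=0.98):
    """Append the live frame; return a clip when the cue fires."""
    buffer.append(frame)
    if audio_energy > threshold and len(buffer) == buffer.maxlen:
        return list(buffer)                      # last 30 s ending at this frame
    return None

rng = np.random.default_rng(1)
for i in range(5000):
    clip = on_new_frame(frame=i, audio_energy=rng.random())
    if clip is not None:
        print(f"highlight ending at frame {i}: {len(clip)} frames")
        break
```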
Citations: 11
An Efficient Architecture for Lifting-Based Forward and Inverse Discrete Wavelet Transform
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521548
S. Aroutchelvame, K. Raahemifar
In this research, an architecture that performs both the forward and inverse lifting-based discrete wavelet transform is proposed. The proposed architecture reduces the hardware requirement by exploiting the redundancy in the arithmetic operations involved in DWT computation, and it does not require any extra memory to store intermediate results. It consists of a predict module, an update module, an address generation module, a control unit and a set of registers that establish data communication between the predict and update modules. Symmetric extension of images at the boundary, as specified in JPEG2000, has been incorporated into our architecture to reduce boundary distortion. The architecture has been described in VHDL at the RTL level and simulated successfully using the ModelSim simulation environment.
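For readers unfamiliar with lifting, the predict/update arithmetic that the proposed modules implement can be illustrated in software. The sketch below performs one 1-D lifting step of the LeGall 5/3 wavelet with simple symmetric boundary handling and checks perfect reconstruction; it illustrates only the math, not the proposed hardware datapath.

```python
# LeGall 5/3 lifting: predict (odd samples) then update (even samples);
# the inverse applies the same steps in reverse order with signs flipped.
import numpy as np

def dwt53_forward(x):
    s, d = x[0::2].astype(float).copy(), x[1::2].astype(float).copy()
    for i in range(len(d)):                    # predict step
        right = s[i + 1] if i + 1 < len(s) else s[i]   # symmetric extension
        d[i] -= 0.5 * (s[i] + right)
    for i in range(len(s)):                    # update step
        left = d[i - 1] if i >= 1 else d[0]
        cur = d[i] if i < len(d) else d[-1]
        s[i] += 0.25 * (left + cur)
    return s, d                                # lowpass, highpass

def dwt53_inverse(s, d):
    s, d = s.copy(), d.copy()
    for i in range(len(s)):                    # undo update
        left = d[i - 1] if i >= 1 else d[0]
        cur = d[i] if i < len(d) else d[-1]
        s[i] -= 0.25 * (left + cur)
    for i in range(len(d)):                    # undo predict
        right = s[i + 1] if i + 1 < len(s) else s[i]
        d[i] += 0.5 * (s[i] + right)
    x = np.empty(len(s) + len(d))
    x[0::2], x[1::2] = s, d
    return x

x = np.arange(16, dtype=float)
lo, hi = dwt53_forward(x)
assert np.allclose(dwt53_inverse(lo, hi), x)   # perfect reconstruction
```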
Citations: 17
A Probabilistic Framework for TV-News Stories Detection and Classification
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521680
F. Colace, P. Foggia, G. Percannella
In this paper we address the problem of partitioning news videos into stories and classifying the stories according to a predefined set of categories. In particular, we propose a multi-level probabilistic framework based on the hidden Markov model and Bayesian network paradigms for the segmentation and classification phases, respectively. The whole analysis exploits information extracted from the video and audio tracks using techniques for superimposed text recognition, speaker identification, speech transcription and anchor detection. The system was tested on a database of Italian news videos and the results are very promising.
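To make the HMM segmentation stage concrete, the toy decoder below runs Viterbi over per-shot observation likelihoods for a two-state "anchor"/"report" model. The state space, probabilities and observations are invented for illustration and are not the paper's model.

```python
# Viterbi decoding of shot labels under a two-state HMM.
import numpy as np

states = ["anchor", "report"]
log_trans = np.log(np.array([[0.7, 0.3],     # P(next state | current state)
                             [0.2, 0.8]]))
log_init = np.log(np.array([0.9, 0.1]))

def viterbi(log_emis):
    """log_emis[t, k] = log P(observation of shot t | state k)."""
    T, K = log_emis.shape
    delta = np.zeros((T, K))
    back = np.zeros((T, K), dtype=int)
    delta[0] = log_init + log_emis[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans   # scores[j, k]: j -> k
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emis[t]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [states[k] for k in reversed(path)]

# Shots 0-1 look anchor-like, 2-4 report-like, 5 anchor-like again.
log_emis = np.log(np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8],
                            [0.1, 0.9], [0.2, 0.8], [0.7, 0.3]]))
print(viterbi(log_emis))
```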
Citations: 23
Object-Based Audio Streaming Over Error-Prone Channels
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521410
Stuart K. Marks, R. González
This paper investigates the benefits of streaming autonomous audio objects, instead of encoded audio frames, over error-prone channels. Due to the nature of autonomous audio objects, such a scheme is error resilient and has a fine-grain scalable bitrate, but it also has the additional benefit of being able to disguise packet loss in the reconstructed signal. This paper proposes object-packing algorithms, which are shown to be able to disguise the presence of long bursts of packet loss, removing the need for complex error-concealment schemes at the decoder.
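The intuition behind object packing can be sketched as interleaving: temporally adjacent objects are spread across different packets, so a burst of lost packets leaves short, isolated gaps that neighboring objects can mask. The round-robin scheme below is an assumption for illustration, not the paper's algorithm.

```python
# Round-robin object packing: object i goes to packet i % num_packets,
# so consecutive packet losses never wipe out a long run of objects.
def pack_objects(objects, num_packets):
    packets = [[] for _ in range(num_packets)]
    for i, obj in enumerate(objects):
        packets[i % num_packets].append((i, obj))
    return packets

def unpack(packets, lost, total):
    """Rebuild the timeline; None marks objects that were in lost packets."""
    timeline = [None] * total
    for p, packet in enumerate(packets):
        if p not in lost:
            for i, obj in packet:
                timeline[i] = obj
    return timeline

objects = [f"obj{i}" for i in range(12)]
packets = pack_objects(objects, num_packets=4)
print(unpack(packets, lost={1, 2}, total=len(objects)))
# Gaps of at most two adjacent objects, instead of one six-object burst.
```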
Citations: 3
Multi-Modal Video Concept Extraction Using Co-Training
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521473
Rong Yan, M. Naphade
For large-scale automatic semantic video characterization, it is necessary to learn and model a large number of semantic concepts. A major obstacle to this is the insufficiency of labeled training samples. Semi-supervised learning algorithms such as co-training may help by incorporating a large amount of unlabeled data, which allows the redundant information across views to improve the learning performance. Although co-training has been successfully applied in several domains, it has not previously been used to detect video concepts. In this paper, we extend co-training to the domain of video concept detection and investigate different co-training strategies as well as their effects on detection accuracy. We demonstrate performance following the guidelines of the TRECVID '03 semantic concept extraction task.
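A bare-bones co-training loop, assuming two feature views (say, audio and visual) and off-the-shelf scikit-learn classifiers, might look like the sketch below; the batch size and label-exchange rule are illustrative choices, not those evaluated in the paper.

```python
# Co-training sketch: two view-specific classifiers iteratively label
# the unlabeled pool and promote their most confident predictions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def co_train(Xa_l, Xv_l, y_l, Xa_u, Xv_u, rounds=5, per_round=10):
    for _ in range(rounds):
        clf_a = LogisticRegression().fit(Xa_l, y_l)
        clf_v = LogisticRegression().fit(Xv_l, y_l)
        if len(Xa_u) == 0:
            break
        conf_a = clf_a.predict_proba(Xa_u).max(axis=1)
        conf_v = clf_v.predict_proba(Xv_u).max(axis=1)
        picks = np.unique(np.concatenate([conf_a.argsort()[-per_round:],
                                          conf_v.argsort()[-per_round:]]))
        # Label each pick with whichever view is more confident about it.
        use_a = conf_a[picks] >= conf_v[picks]
        new_y = np.where(use_a, clf_a.predict(Xa_u[picks]),
                         clf_v.predict(Xv_u[picks]))
        Xa_l = np.vstack([Xa_l, Xa_u[picks]])
        Xv_l = np.vstack([Xv_l, Xv_u[picks]])
        y_l = np.concatenate([y_l, new_y])
        keep = np.setdiff1d(np.arange(len(Xa_u)), picks)
        Xa_u, Xv_u = Xa_u[keep], Xv_u[keep]
    return clf_a, clf_v

rng = np.random.default_rng(3)
Xa, Xv = rng.normal(size=(200, 5)), rng.normal(size=(200, 5))
y = (Xa[:, 0] + Xv[:, 0] > 0).astype(int)
clf_a, clf_v = co_train(Xa[:20], Xv[:20], y[:20], Xa[20:], Xv[20:])
```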
Citations: 14
Adaptive local context suppression of multiple cues for salient visual attention detection
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521431
Yiqun Hu, D. Rajan, L. Chia
Visual attention is obtained by determining contrasts of low-level features, or attention cues, such as intensity and color. We propose a new texture attention cue that is shown to be more effective for images in which the salient object regions and the background have similar visual characteristics. Current visual attention models do not consider local contextual information to highlight attention regions. We also propose a feature combination strategy that suppresses saliency based on context information, which is effective in determining the true attention region. We compare our approach with other visual attention models using a novel average discrimination ratio measure.
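A simplified version of such a scheme can be sketched as center-surround contrast maps whose combination weights drop wherever a cue is uniformly active across its local neighborhood. The kernel sizes and combination rule below are assumptions for illustration, not the authors' formulation.

```python
# Center-surround contrast per cue, combined with a local-context
# suppression weight (low where the neighborhood is uniformly active).
import numpy as np
from scipy.ndimage import uniform_filter

def center_surround(feature_map, center=3, surround=15):
    return np.abs(uniform_filter(feature_map, center)
                  - uniform_filter(feature_map, surround))

def combine_with_suppression(maps, window=25):
    saliency = np.zeros_like(maps[0])
    for m in maps:
        m = (m - m.min()) / (m.max() - m.min() + 1e-12)   # normalize to [0, 1]
        weight = 1.0 - uniform_filter(m, window)          # context suppression
        saliency += weight * m
    return saliency / len(maps)

rng = np.random.default_rng(2)
intensity = rng.random((64, 64)); intensity[20:30, 20:30] += 2.0
color = rng.random((64, 64))
sal = combine_with_suppression([center_surround(intensity),
                                center_surround(color)])
print(np.unravel_index(sal.argmax(), sal.shape))   # near the bright patch
```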
Citations: 40