
2005 IEEE International Conference on Multimedia and Expo: latest publications

Lossless image compression with tree coding of magnitude levels
Pub Date : 2005-07-25 DOI: 10.1109/ICME.2005.1521508
Hua Cai, Jiang Li
With the rapid development of digital technology in consumer electronics, the demand to preserve raw image data for further editing or repeated compression is increasing. Traditional lossless image coders usually consist of computationally intensive modeling and entropy coding phases, and therefore may not be suitable for mobile devices or scenarios with strict real-time requirements. This paper presents a new image coding algorithm based on a simple architecture that makes it easy to model and encode the residual samples. In the proposed algorithm, each residual sample is separated into three parts: (1) a sign value, (2) a magnitude value, and (3) a magnitude level. A tree structure is then used to organize the magnitude levels. By simply coding the tree and the other two parts without any complicated modeling or entropy coding, good performance can be achieved at very low computational cost in the binary-uncoded mode. Moreover, with the aid of context-based arithmetic coding, the magnitude values are further compressed in the arithmetic-coded mode, giving performance close to JPEG-LS and JPEG2000.
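The three-part decomposition the abstract describes can be sketched as follows. The exact definitions of "magnitude level" and "magnitude value" below (the bit-length of |r| and the bits under its leading one, much as in Exp-Golomb coding) are assumptions for illustration, not taken from the paper.

```python
def split_residual(r: int):
    """Decompose a prediction residual into the three parts the paper names:
    a sign bit, a magnitude level (here assumed to be the bit-length of |r|),
    and a magnitude value (the bits of |r| below its leading 1)."""
    sign = 1 if r < 0 else 0
    m = abs(r)
    level = m.bit_length()                          # 0 for r == 0
    value = m - (1 << (level - 1)) if level else 0  # strip the leading 1
    return sign, level, value

def join_residual(sign: int, level: int, value: int) -> int:
    """Inverse mapping, confirming the decomposition is lossless."""
    m = ((1 << (level - 1)) | value) if level else 0
    return -m if sign else m
```

Because residual magnitudes of neighboring samples tend to be similar, neighboring levels are similar too, which is what a tree over the magnitude levels can exploit.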
Cited by: 10
A new approach for real time motion estimation using robust statistics and MPEG domain applied to mosaic images construction
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521444
Lluis Barceló, R. L. Felip, Xavier Binefa
Dominant motion estimation in video sequences is a task that must often be solved in computer vision, but it involves a high computational cost due to the overwhelming amount of data to be processed when working in the image domain. In this paper we introduce a novel technique for motion analysis in video sequences that takes advantage of the motion information in MPEG streams and their structure, using imaginary line tracking and robust statistics to overcome the noise present in compressed-domain information. To demonstrate the reliability of the new approach, we also show the results of its application to the mosaic image construction problem.
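As a rough illustration of robust statistics applied to noisy compressed-domain motion data, a coordinate-wise median of the block motion vectors rejects outlier blocks (moving foreground objects, corrupted vectors) when estimating the dominant motion. The paper's actual robust estimator may differ; this is only a stand-in for the principle.

```python
import statistics

def dominant_motion(motion_vectors):
    """Coordinate-wise median of block motion vectors: a simple robust
    estimator of the dominant (global) motion that tolerates a sizeable
    fraction of outlier vectors from the noisy MPEG compressed domain."""
    dx = statistics.median(v[0] for v in motion_vectors)
    dy = statistics.median(v[1] for v in motion_vectors)
    return dx, dy
```

With eight consistent background vectors and two wild outliers, the median still recovers the background motion, where a plain mean would be pulled off target.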
Cited by: 6
Enhancing curvature scale space features for robust shape classification
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521464
S. Kopf, T. Haenselmann, W. Effelsberg
The curvature scale space (CSS) technique, which is also part of the MPEG-7 standard, is a robust method for describing complex shapes. The central idea is to analyze the curvature of a shape and derive features from inflection points. A major drawback of the CSS method is its poor representation of convex segments: convex objects cannot be represented at all due to the absence of inflection points. We have extended the CSS approach to generate feature points for both concave and convex segments of a shape. This generic approach is applicable to arbitrary objects. In the experimental results, we evaluate, as a comprehensive example, the automatic recognition of characters in images and videos.
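The curvature analysis at the heart of CSS can be sketched with central differences on a closed contour; inflection points are where the curvature changes sign. A sampled circle illustrates the drawback the abstract mentions: its curvature never changes sign, so a purely convex shape yields no classic CSS feature points.

```python
def curvature(contour):
    """Discrete curvature k = (x'y'' - y'x'') / (x'^2 + y'^2)^1.5 along a
    closed contour of (x, y) points, using central differences with
    wrap-around indexing."""
    n = len(contour)
    ks = []
    for i in range(n):
        (x0, y0) = contour[(i - 1) % n]
        (x1, y1) = contour[i]
        (x2, y2) = contour[(i + 1) % n]
        dx, dy = (x2 - x0) / 2.0, (y2 - y0) / 2.0       # first derivatives
        ddx, ddy = x2 - 2 * x1 + x0, y2 - 2 * y1 + y0   # second derivatives
        denom = (dx * dx + dy * dy) ** 1.5 or 1e-12
        ks.append((dx * ddy - dy * ddx) / denom)
    return ks
```

For a circle of radius 10 the values come out near 1/10 and all of one sign: no inflection points, hence no features for the unextended CSS method to use.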
Cited by: 30
Texture-Based Remote-Sensing Image Segmentation
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521710
Dihua Guo, V. Atluri, N. Adam
Typically, high-resolution remote sensing (HRRS) images contain a high level of noise and possess different texture scales. As a result, existing image segmentation approaches are not well suited to HRRS imagery. In this paper, we present an unsupervised texture-based segmentation algorithm for HRRS images that extends local binary pattern texture features and the lossless wavelet transform. Our experimental results using USGS 1 ft orthoimagery show a significant improvement over the previously proposed LBP approach.
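For reference, the classic 8-neighbour local binary pattern that the paper extends can be sketched as follows; each neighbour at least as bright as the centre pixel contributes one bit to an 8-bit texture code.

```python
def lbp_code(img, x, y):
    """Basic 8-neighbour local binary pattern at pixel (x, y) of a 2-D list
    `img` (row-major, interior pixels only). This is the classic operator,
    not the paper's extension."""
    c = img[y][x]
    nbrs = [img[y - 1][x - 1], img[y - 1][x], img[y - 1][x + 1],
            img[y][x + 1], img[y + 1][x + 1], img[y + 1][x],
            img[y + 1][x - 1], img[y][x - 1]]   # clockwise from top-left
    code = 0
    for bit, p in enumerate(nbrs):
        if p >= c:
            code |= 1 << bit
    return code
```

A histogram of these codes over a window is the usual LBP texture descriptor; segmentation then groups windows with similar histograms.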
Cited by: 24
Combining Caption and Visual Features for Semantic Event Classification of Baseball Video
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521656
W. Lie, Sheng-Hsiung Shia
In a baseball game, an event is defined as the portion of the video between two pitches, and a play is defined as a batter finishing his plate appearance. A play is a concatenation of many events, and a baseball game is formed by a series of plays. In this paper, only the event occurring on the last pitch of a plate appearance is detected. It is then semantically classified to represent the corresponding play, using an algorithm that integrates caption rule inference and visual feature analysis. The proposed system is capable of classifying each baseball play into eleven semantic categories that are popular and familiar to most audiences. In an experiment with 260 test plays, the classification rate reaches 87%.
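A toy version of caption rule inference might look as follows. The rules and state fields are hypothetical, invented only to show how changes in the on-screen caption (score, outs, runners) around the final pitch can narrow down the play category before visual features are consulted; the paper's actual rules and eleven categories are not reproduced here.

```python
def classify_play(delta_score, delta_outs, delta_runners):
    """Hypothetical caption rules: classify a play from the change in the
    scoreboard caption (runs scored, outs recorded, runners on base)
    between the frames before and after the final pitch."""
    if delta_score > 0 and delta_runners < 0:
        return "scoring hit / home run"
    if delta_outs > 0 and delta_runners == 0:
        return "out (fly/ground/strikeout)"
    if delta_runners > 0 and delta_score == 0 and delta_outs == 0:
        return "base hit or walk"
    return "other (needs visual features)"
```

The fallback branch is where a real system would bring in the visual feature analysis the abstract pairs with the caption rules.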
Cited by: 19
H.264/AVC interleaving for 3G wireless video streaming
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521561
T. Schierl, M. Kampmann, T. Wiegand
We present a streaming system that uses interleaved transmission for real-time H.264/AVC video in 3G wireless environments, with benefits shown especially in the presence of link outages. The 3GPP packet-switched streaming service Rel. 6 specifies H.264/AVC and its RTP payload format, which allows interleaved transmission of H.264/AVC NAL units. Our simulations also include audio in the interleaving framework and are conducted within a testbed that emulates a 3G network, including block error rates on the physical layer, a link-layer retransmission buffer for different error rates, and link outages. The experimental results demonstrate the superior performance of interleaving for typical link outage settings.
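The effect of interleaving can be illustrated with a simple block interleaver: write units row-wise into a matrix, read them out column-wise, so a burst loss on the channel hits units that are far apart in decoding order. This is only a sketch of the principle; the actual RTP payload format signals decoding order with dedicated fields rather than a fixed matrix.

```python
def interleave(units, width):
    """Block interleaver: arrange `units` into rows of length `width` and
    read them out column by column. Consecutive transmitted units then come
    from decoding positions `width` apart, spreading burst losses."""
    rows = [units[i:i + width] for i in range(0, len(units), width)]
    out = []
    for col in range(width):
        for row in rows:
            if col < len(row):
                out.append(row[col])
    return out
```

When the matrix is full, applying the interleaver again with the row count as the width restores the original order, which is how the receiver's deinterleaving buffer undoes the reordering.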
Cited by: 14
Gridmedia: A Multi-Sender Based Peer-to-Peer Multicast System for Video Streaming
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521498
Meng Zhang, Yun Tang, Li Zhao, Jian-Guang Luo, Shiqiang Yang
We present a novel single-source peer-to-peer multicast architecture called GridMedia, which mainly consists of 1) a multi-sender based overlay multicast protocol (MSOMP) and 2) a multi-sender based redundancy-retransmitting algorithm (MSRRA). The MSOMP deploys a mesh-based two-layer structure and groups all peers into clusters, with multiple distinct paths from the source root to each peer. To address the problem of long burst packet loss, the MSRRA is used at the sender peers to patch lost packets based on receiver-side loss pattern prediction. Consequently, GridMedia provides a scalable and reliable video streaming system for a large and highly dynamic population of end hosts, and ensures quality of service in terms of continuous playback, bandwidth demand, and low latency. A real experimental system based on the GridMedia architecture has been run over CERNET, broadcasting TV programs for seven months. It has attracted more than 140,000 end users, with almost 600 simultaneously online in August 2004 during the Athens Olympic Games.
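The multi-sender idea, with each receiver pulling a disjoint substream from several parent peers, can be sketched as a round-robin assignment of packet sequence numbers. This is illustrative only; MSOMP's actual scheduling across the mesh is more involved than a static split.

```python
def assign_packets(packet_ids, senders):
    """Round-robin split of packet sequence numbers across sender peers:
    sender i serves every len(senders)-th packet starting at offset i,
    so the substreams are disjoint and together cover the whole stream."""
    step = len(senders)
    return {s: packet_ids[i::step] for i, s in enumerate(senders)}
```

If one parent stalls, only its stride of packets is affected, which is what makes redundancy retransmission from the remaining senders (the MSRRA role) feasible.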
Cited by: 30
Multiple Objective Frame Rate Up Conversion
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521408
T. Chong, O. Au, Wing-San Chau, Tai-Wai Chan
In this paper, we propose a multiple objective frame rate up conversion algorithm (MOFRUC) that utilizes two different models. The first is a constant-velocity model that assumes the object's position is a linear function of time. The second exploits the spatial correlation between neighboring blocks and assumes that pixel intensity is highly correlated within a small local area. In this model, the perceptual quality of the interpolated frame is also taken into account and blocking artifacts are minimized. The proposed MOFRUC estimates the motion trajectory with the first model and interpolates the frame along that trajectory. At the same time, the algorithm refines the motion trajectory by maximizing a spatial correlation measure defined in the second model and interpolates the frame with minimal blocking artifacts. Simulation results show that the proposed MOFRUC outperforms other existing algorithms and produces high-quality interpolated frames.
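The constant-velocity interpolation of the first model can be sketched in one dimension: the mid frame samples the previous frame half a motion vector back and the next frame half a motion vector forward, then averages. A single global vector and a 1-D signal are simplifications; the paper operates on per-block vectors in 2-D.

```python
def mc_interpolate(prev, nxt, mv):
    """Motion-compensated mid-frame interpolation for a 1-D signal under a
    constant-velocity model: an object moving by `mv` samples from `prev`
    to `nxt` sits halfway along that trajectory in the interpolated frame.
    Indices are clamped at the borders."""
    n = len(prev)
    half = mv // 2
    out = []
    for i in range(n):
        a = prev[max(0, min(n - 1, i - half))]          # trace back
        b = nxt[max(0, min(n - 1, i + (mv - half)))]    # trace forward
        out.append((a + b) / 2.0)
    return out
```

An impulse at position 4 in the previous frame and 6 in the next (motion of 2) lands at position 5 in the interpolated frame, exactly on the linear trajectory.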
Cited by: 4
Relevance Feedback Methods in Content Based Retrieval and Video Summarization
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521602
Micha Haas, Ard A. J. Oerlemans, M. Lew
In the current state of the art in multimedia content analysis (MCA), the fundamental techniques are typically derived from core pattern recognition and computer vision algorithms. It is well known that fully automatic pattern recognition and computer vision approaches have not succeeded in being both robust and domain independent, so we should not expect more from MCA algorithms. The natural exception is methods that are human-interactive rather than fully automatic. In this paper, we describe some of our recent work in multimedia content analysis across multiple domains, where the fundamental technique is grounded in interactive search. Our novel algorithm integrates our previous work on wavelet-based salient points and genetic algorithms, and shows that the main contribution and improvement come from the user feedback provided by interactive search.
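As a baseline for the kind of relevance feedback discussed here, the classic Rocchio update moves a query feature vector toward user-marked relevant examples and away from non-relevant ones. This is a textbook method, not the paper's salient-point/genetic-algorithm approach.

```python
def rocchio_update(query, relevant, nonrelevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio relevance feedback: new query = alpha*q + beta*mean(relevant)
    - gamma*mean(nonrelevant), with negative components clamped to zero.
    The weights shown are conventional defaults, not tuned values."""
    def mean(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(col) / len(vectors) for col in zip(*vectors)]
    r, nr = mean(relevant), mean(nonrelevant)
    return [max(0.0, alpha * q + beta * ri - gamma * ni)
            for q, ri, ni in zip(query, r, nr)]
```

Iterating this with fresh user judgments is what lets an interactive system outperform a one-shot automatic query, the effect the abstract attributes to user feedback.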
Cited by: 3
Proxy-based reference picture selection for real-time video transmission over mobile networks
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521422
Wei Tu, E. Steinbach
We propose a framework for error-robust real-time video transmission over wireless networks. In our approach, we cope with packet loss on the downlink by retransmitting lost packets from the base station (BS) to the receiver for error recovery. Retransmissions are enabled by using fixed-distance reference picture selection during encoding, with a prediction distance that corresponds to the round-trip time between the BS and the receiver. We deal with transmission errors on the uplink by sending acknowledgements and predicting the next frame from the most recent frame that has been positively acknowledged by the BS. We show that these two separate approaches for uplink and downlink fit together nicely. We compare our approach to state-of-the-art error resilience approaches that employ random intra updates of macroblocks and FEC across packets. At the same bit rate and packet loss rate we observe improvements of up to 4.5 dB for our scheme.
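Fixed-distance reference picture selection can be sketched as always predicting from a frame roughly one round trip old, so that a retransmitted packet can still arrive before its frame is needed as a reference. The frame arithmetic below is an assumption for illustration, not the paper's exact rule.

```python
import math

def reference_frame_index(current_idx, rtt_ms, frame_rate):
    """Pick the reference frame for predicting frame `current_idx`:
    a fixed prediction distance covering one round trip (in frames),
    clamped so early frames still have a valid reference."""
    distance = max(1, math.ceil(rtt_ms / 1000.0 * frame_rate))
    return max(0, current_idx - distance)
```

At 25 fps with a 200 ms round trip, the encoder always reaches five frames back; any packet lost on the downlink has a full round trip to be retransmitted before its frame becomes a prediction reference.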
Cited by: 10