
Latest publications from the 2006 IEEE International Conference on Multimedia and Expo

Video and Audio Editing for Mobile Applications
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262778
Ari Hourunranta, A. Islam, F. Chebil
Video content creation and consumption have become increasingly available to the masses with the emergence of handheld devices capable of shooting, downloading, and playing videos. Video editing is a natural and necessary operation that is most commonly employed by users for finalizing and organizing their video content. With the constraints in processing power and memory, conventional spatial-domain video editing is not a solution for mobile applications. In this paper, we present a complete video editing system for efficiently editing video content on mobile phones using compressed-domain editing algorithms. A critical factor from a usability point of view is the processing speed of the editing application. We show that with the proposed compressed-domain editing system, typical video editing operations can be performed much faster than real time on today's S60 phones.
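The abstract does not detail the editing algorithms, but one core trick of compressed-domain editing is cutting at random-access (intra-coded) frames so that no decoding and re-encoding is needed. The sketch below is a minimal, hypothetical illustration of that idea; the frame records and field names are invented, not taken from the paper.

```python
# Minimal sketch of one compressed-domain editing idea: cut a clip at an intra
# (I) frame boundary so the compressed payloads can be copied without re-encoding.
# The frame representation below is hypothetical.

def cut_at_keyframe(frames, cut_time):
    """Return frames from the last I-frame at or before cut_time onward.

    frames: list of dicts like {"t": seconds, "type": "I" | "P" | "B", "data": bytes}
    """
    start = 0
    for i, f in enumerate(frames):
        if f["type"] == "I" and f["t"] <= cut_time:
            start = i          # latest usable random-access point so far
        if f["t"] > cut_time:
            break
    return frames[start:]      # compressed payloads are copied as-is, no re-encode

# Toy 1 fps stream with an I-frame every 3 frames; cutting at t=4.2 s starts the
# output at t=3.0, the preceding I-frame.
clip = [{"t": float(t), "type": "I" if t % 3 == 0 else "P", "data": b""} for t in range(10)]
print([f["t"] for f in cut_at_keyframe(clip, 4.2)])
```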
{"title":"Video and Audio Editing for Mobile Applications","authors":"Ari Hourunranta, A. Islam, F. Chebil","doi":"10.1109/ICME.2006.262778","DOIUrl":"https://doi.org/10.1109/ICME.2006.262778","url":null,"abstract":"Video content creation and consumption have been increasingly available for the masses with the emergence of handheld devices capable of shooting, downloading, and playing videos. Video editing is a natural and necessary operation that is most commonly employed by users for finalizing and organizing their video content. With the constraints in processing power and memory, conventional spatial domain video editing is not a solution for mobile applications. In this paper, we present a complete video editing system for efficiently editing video content on mobile phones using compressed domain editing algorithms. A critical factor from usability point of view is the processing speed of the editing application. We show that with the proposed compressed domain editing system, typical video editing operations can be performed much faster than real-time on today's S60 phones","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117095129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Approximating Optimal Visual Sensor Placement
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262766
E. Hörster, R. Lienhart
Many novel multimedia applications use visual sensor arrays. In this paper we address the problem of optimally placing multiple visual sensors in a given space. Our linear programming approach determines the minimum number of cameras needed to cover the space completely at a given sampling frequency. Simultaneously it determines the optimal positions and poses of the visual sensors. We also show how to account for visual sensors with different properties and costs if more than one kind is available, and report performance results.
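The paper formulates placement as a linear program over candidate camera positions and poses. As a hedged illustration of the underlying covering problem only, the sketch below uses a greedy set-cover approximation instead of the authors' LP, with an invented grid of space samples, candidate poses, and field-of-view model.

```python
# Greedy set-cover approximation of the camera placement problem: pick, at each
# step, the candidate camera pose that covers the most still-uncovered sample
# points. The geometry (grid, poses, range, field of view) is made up.
import math

def covers(cam, pt, fov_deg=90.0, rng=5.0):
    """True if point pt lies within range and angular field of view of camera cam."""
    (cx, cy, heading), (px, py) = cam, pt
    dx, dy = px - cx, py - cy
    if math.hypot(dx, dy) > rng:
        return False
    ang = math.degrees(math.atan2(dy, dx))
    diff = (ang - heading + 180) % 360 - 180        # wrapped angular difference
    return abs(diff) <= fov_deg / 2

points = [(x, y) for x in range(6) for y in range(6)]                     # space samples
candidates = [(x, y, h) for x in (0, 5) for y in (0, 5) for h in (0, 90, 180, 270)]

chosen, uncovered = [], set(points)
while uncovered:
    best = max(candidates, key=lambda c: sum(covers(c, p) for p in uncovered))
    gained = {p for p in uncovered if covers(best, p)}
    if not gained:
        break                                       # remaining points cannot be covered
    chosen.append(best)
    uncovered -= gained
print("cameras used:", len(chosen), "uncovered points:", len(uncovered))
```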
{"title":"Approximating Optimal Visual Sensor Placement","authors":"E. Hörster, R. Lienhart","doi":"10.1109/ICME.2006.262766","DOIUrl":"https://doi.org/10.1109/ICME.2006.262766","url":null,"abstract":"Many novel multimedia applications use visual sensor arrays. In this paper we address the problem of optimally placing multiple visual sensors in a given space. Our linear programming approach determines the minimum number of cameras needed to cover the space completely at a given sampling frequency. Simultaneously it determines the optimal positions and poses of the visual sensors. We also show how to account for visual sensors with different properties and costs if more than one kind is available, and report performance results.","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125920006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 78
A Protection Processor for MPEG-21 Players
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262790
P. Nesi, D. Rogai, A. Vallotti
The design and implementation of MPEG-21 players and authoring tools presents several critical points to be solved. One of the most relevant is the security level and protection processing in the players. This paper presents a solution for realizing the components in charge of enforcing Digital Rights Management in AXMEDIS tools for MPEG-21 digital content. The proposed architecture provides functionality to create a trusted environment on the client side as well as dynamic protection and unprotection of digital content, including digital resources and their organization and metadata. The same solution can be used to achieve the desired security level in any other MPEG-21 player or authoring tool. The architecture presented hereinafter has been adopted to enforce protection on the authoring and player tools developed for the AXMEDIS IST FP6 R&D European Commission project.
{"title":"A Protection Processor for MPEG-21 Players","authors":"P. Nesi, D. Rogai, A. Vallotti","doi":"10.1109/ICME.2006.262790","DOIUrl":"https://doi.org/10.1109/ICME.2006.262790","url":null,"abstract":"The design and implementation of MPEG-21 players and authoring tools presents several critical points to be solved. One of the most relevant is the security level and protection processing in the players. This paper presents a solution for the realization of components in charge of enforcing Digital Rights Management in AXMEDIS tools for MPEG-21 digital content. The proposed architecture provides functionalities to create both trusted environment on the client side and dynamic protection and unprotection of digital content including digital resources and their organization and metadata. The same solution can be used to achieve the desired security level in any other MPEG-21 player or authoring tool. The architecture presented hereinafter has been adopted to enforce protection on authoring and player tools developed for the AXMEDIS IST FP6 R&D European Commission project","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124705768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
Media Synchronization Method for Video Hypermedia Application Based on Extended Event Model
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262774
Hironobu Abe, H. Shigeno, Ken-ichi Okada
This paper proposes an extended event model with a media synchronization method for video hypermedia applications. In this extended event model, video and metadata are synchronized by periodically inserting event information into the video multiplex. We considered the following design policies: 1) the model is independent of the video format and delivery method, and 2) the synchronization accuracy can be tuned depending on the purpose and use of the metadata. We designed the extended event model based on these policies and implemented it as an encode/decode library for Windows Media. Based on this model, we developed a video hypermedia system prototype and performed evaluation experiments. The evaluation of the prototype's real-time synchronization performance showed that, for sports video content, a synchronization accuracy of 100 msec between video and metadata makes our method effective for video hypermedia applications.
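As a rough, hedged illustration of the binding step implied by the abstract, the sketch below pairs metadata annotations with the nearest inserted event timestamp within a 100 msec tolerance (the accuracy figure the paper reports for sports content); the data structures and values are invented.

```python
# Sketch: bind metadata items to the nearest event marker inserted into the
# video multiplex, rejecting pairs that miss the synchronization tolerance.

def bind_metadata(event_marks, metadata, tol=0.100):
    """Pair each metadata item (t, payload) with the nearest event-mark time."""
    bound = []
    for t, payload in metadata:
        nearest = min(event_marks, key=lambda m: abs(m - t))
        if abs(nearest - t) <= tol:
            bound.append((nearest, payload))
    return bound

marks = [0.0, 0.5, 1.0, 1.5, 2.0]                   # seconds of inserted events
meta = [(0.48, "player A enters"), (1.72, "goal")]  # (time, annotation)
print(bind_metadata(marks, meta))                    # the second item misses the 100 ms window
```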
{"title":"Media Synchronization Method for Video Hypermedia Application Based on Extended Event Model","authors":"Hironobu Abe, H. Shigeno, Ken-ichi Okada","doi":"10.1109/ICME.2006.262774","DOIUrl":"https://doi.org/10.1109/ICME.2006.262774","url":null,"abstract":"This paper describes a proposal of an extended event model using a media synchronization method for video hypermedia applications. In this extended event model, video and metadata are synchronized by periodically inserting event information in the video multiplex. We considered the following design policies: 1) a model that is independent of the video format and delivery method, 2) the synchronization accuracy can be tuned depending on the purpose and use of the metadata. We designed the extended event model based on the above design policies, and implemented this model as an encode/decode library for Windows Media. Based on this model we developed a video hypermedia system prototype and performed evaluation experiments. The evaluation results of real time synchronization performance of the system prototype showed that in the case of sports video content a synchronization accuracy of 100 msec between video and metadata makes our method effective for use in video hypermedia applications","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129492504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Adaptive Dual AK-D Tree Search Algorithm for ICP Registration Applications
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262598
Jiann-Der Lee, Shih-Sen Hsieh, Chung-Hsien Huang, Li-Chang Liu, Cheien-Tsai Wu, Shin-Tseng Lee, Jyi-Feng Chen
An algorithm for finding coupling points plays an important role in the iterative closest point (ICP) algorithm, which is widely used in registration applications in the medical and 3-D architecture areas. In recent research on finding coupling points, the approximate K-D tree search algorithm (AK-D tree) is an efficient nearest neighbor search algorithm with comparable results. We propose an adaptive dual AK-D tree search algorithm (ADAK-D tree) for searching and synthesizing coupling points as significant control points to improve registration accuracy in ICP registration applications. The ADAK-D tree utilizes the AK-D tree twice in different geometrical projection orders to preserve the true nearest neighbor points used in later ICP stages. An adaptive threshold in the ADAK-D tree is used to retain sufficient coupling points for a smaller alignment error. Experimental results show that the registration accuracy of the ADAK-D tree is better than that of the AK-D tree, and the computation time is acceptable.
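The abstract's coupling-point selection can be pictured as nearest-neighbor search plus an adaptive distance threshold. The sketch below shows that generic step with SciPy's k-d tree on random point sets; it does not reproduce the dual AK-D (ADAK-D) construction itself.

```python
# Simplified sketch of the coupling-point step used inside ICP: find
# nearest-neighbour correspondences with a k-d tree and keep only pairs under an
# adaptive distance threshold derived from the distance statistics.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
model = rng.random((500, 3))                          # reference surface points
scene = model + rng.normal(0, 0.01, model.shape)      # noisy observation of the same surface

tree = cKDTree(model)
dists, idx = tree.query(scene, k=1)                   # nearest model point for each scene point

threshold = np.median(dists) * 2.5                    # adaptive threshold (illustrative choice)
keep = dists < threshold
pairs = np.stack([np.nonzero(keep)[0], idx[keep]], axis=1)   # (scene index, model index) pairs
print(f"kept {len(pairs)} of {len(scene)} coupling points, threshold={threshold:.4f}")
```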
{"title":"Adaptive Dual AK-D Tree Search Algorithm for ICP Registration Applications","authors":"Jiann-Der Lee, Shih-Sen Hsieh, Chung-Hsien Huang, Li-Chang Liu, Cheien-Tsai Wu, Shin-Tseng Lee, Jyi-Feng Chen","doi":"10.1109/ICME.2006.262598","DOIUrl":"https://doi.org/10.1109/ICME.2006.262598","url":null,"abstract":"An algorithm for finding coupling points plays an important role in the iterative closest point algorithm (ICP) which is widely used in registration applications in medical and 3-D architecture areas. In recent researches of finding coupling points, Approximate K-D tree search algorithm (AK-D tree) is an efficient nearest neighbor search algorithm with comparable results. We proposed adaptive dual AK-D tree search algorithm (ADAK-D tree) for searching and synthesizing coupling points as significant control points to improve the registration accuracy in ICP registration applications. ADAK-D tree utilizes AK-D tree twice in different geometrical projection orders to reserve true nearest neighbor points used in later ICP stages. An adaptive threshold in ADAK-D tree is used to reserve sufficient coupling points for a smaller alignment error. Experimental results are shown that the registration accuracy of using ADAK-D tree is improved more than the result of using AK-D tree and the computation time is acceptable","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129865263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Audiovisual Anchorperson Detection for Topic-Oriented Navigation in Broadcast News
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262906
M. Haller, Hyoung‐Gook Kim, T. Sikora
This paper presents a content-based audiovisual video analysis technique for anchorperson detection in broadcast news. For topic-oriented navigation in newscasts, a segmentation of the topic boundaries is needed. As the anchorperson gives a strong indication of such boundaries, the presented technique automatically determines this high-level information for video indexing from MPEG-2 videos and stores the results in an MPEG-7 conformant format. The multimodal analysis is carried out separately in the auditory and visual modalities, and decision fusion forms the final anchorperson segments.
{"title":"Audiovisual Anchorperson Detection for Topic-Oriented Navigation in Broadcast News","authors":"M. Haller, Hyoung‐Gook Kim, T. Sikora","doi":"10.1109/ICME.2006.262906","DOIUrl":"https://doi.org/10.1109/ICME.2006.262906","url":null,"abstract":"This paper presents a content-based audiovisual video analysis technique for anchorperson detection in broadcast news. For topic-oriented navigation in newscasts, a segmentation of the topic boundaries is needed. As the anchorperson gives a strong indication for such boundaries, the presented technique automatically determines that high-level information for video indexing from MPEG-2 videos and stores the results in an MPEG-7 conform format. The multimodal analysis process is carried out separately in the auditory and visual modality, and the decision fusion forms the final anchorperson segments","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128581438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Experiential Sampling based Foreground/Background Segmentation for Video Surveillance
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262904
P. Atrey, Vinay Kumar, Anurag Kumar, M. Kankanhalli
Segmentation of foreground and background has been an important research problem arising out of many applications, including video surveillance. A method commonly used for segmentation is "background subtraction", i.e., thresholding the difference between the estimated background image and the current image. Adaptive Gaussian mixture based background modelling has been proposed by many researchers to increase robustness against environmental changes. However, all these methods, being computationally intensive, need to be optimized for efficient and real-time performance, especially at higher image resolutions. In this paper, we propose an improved foreground/background segmentation method which uses an experiential sampling technique to restrict the computational effort to the region of interest. We exploit the fact that the region of interest is in general present only in a small part of the image; therefore, attention should be focused only on those regions. The proposed method shows a significant gain in processing speed at the expense of a minor loss in accuracy. We provide experimental results and a detailed analysis to show the utility of our method.
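A minimal sketch of the attention-restriction idea, assuming a per-pixel running Gaussian background model that is updated and tested only at randomly sampled positions; this simplifies the paper's experiential sampling and Gaussian mixture model, and the frames here are synthetic.

```python
# Restrict background/foreground work to sampled positions: a running Gaussian
# background model (mean, variance per pixel) is updated and thresholded only at
# a random subset of pixels, mimicking attention-limited processing.
import numpy as np

H, W, alpha = 120, 160, 0.05
mean = np.full((H, W), 100.0, np.float32)     # assume the background level is already learned
var = np.full((H, W), 15.0**2, np.float32)
rng = np.random.default_rng(1)

def process(frame, n_samples=2000):
    """Update the model and report foreground hits at the sampled positions only."""
    ys = rng.integers(0, H, n_samples)
    xs = rng.integers(0, W, n_samples)
    d = frame[ys, xs] - mean[ys, xs]
    fg = d * d > 9.0 * var[ys, xs]                     # ~3-sigma test at sampled pixels
    mean[ys, xs] += alpha * d                          # running update, only where sampled
    var[ys, xs] = (1 - alpha) * var[ys, xs] + alpha * d * d
    return ys[fg], xs[fg]

frame = np.full((H, W), 100.0, np.float32)
frame[40:80, 60:100] = 200.0                           # a bright "object" enters the scene
ys, xs = process(frame)
print("sampled foreground hits:", len(ys))
```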
{"title":"Experiential Sampling based Foreground/Background Segmentation for Video Surveillance","authors":"P. Atrey, Vinay Kumar, Anurag Kumar, M. Kankanhalli","doi":"10.1109/ICME.2006.262904","DOIUrl":"https://doi.org/10.1109/ICME.2006.262904","url":null,"abstract":"Segmentation of foreground and background has been an important research problem arising out of many applications including video surveillance. A method commonly used for segmentation is \"background subtraction\" or thresholding the difference between the estimated background image and current image. Adaptive Gaussian mixture based background modelling has been proposed by many researchers for increasing the robustness against environmental changes. However, all these methods, being computationally intensive, need to be optimized for efficient and real-time performance especially at a higher image resolution. In this paper, we propose an improved foreground/background segmentation method which uses experiential sampling technique to restrict the computational efforts in the region of interest. We exploit the fact that the region of interest in general is present only in a small part of the image, therefore, the attention should only be focused in those regions. The proposed method shows a significant gain in processing speed at the expense of minor loss in accuracy. We provide experimental results and detailed analysis to show the utility of our method","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128642314","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Interactions and Integrations of Multiple Sensory Channels in Human Brain
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262437
S. Nishida
This paper describes a couple of new principles regarding the interactions and integrations of multiple sensory channels in the human brain. First, as opposed to the general belief that the perception of shape and that of color are relatively independent of motion processing, the human visual system integrates shape and color signals along the perceived motion trajectory in order to improve the visibility of the shape and color of moving objects. Second, when the human sensory system binds the outputs of different sensory channels (including audio-visual signals) based on their temporal synchrony, it uses only sparse salient features rather than the time courses of the full sensory signals. We believe these principles are potentially useful for the development of effective audiovisual processing and presentation devices.
{"title":"Interactions and Integrations of Multiple Sensory Channels in Human Brain","authors":"S. Nishida","doi":"10.1109/ICME.2006.262437","DOIUrl":"https://doi.org/10.1109/ICME.2006.262437","url":null,"abstract":"This paper describes a couple of new principles with regard to interactions and integrations of multiple sensory channels in the human brain. First, as opposed to the general belief that the perception of shape and that of color are relatively independent of motion processing, human visual system integrates shape and color signals along perceived motion trajectory in order to improve visibility of shape and color of moving objects. Second, when the human sensory system binds the outputs of different sensory channels, (including audio-visual signals) based on their temporal synchrony, it uses only sparse salient features rather than using the time courses of full sensory signals. We believe these principles are potentially useful for development of effective audiovisual processing and presentation devices","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129376748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Reuse of Motion Processing for Camera Stabilization and Video Coding
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262479
Bao Lei, R. K. Gunnewiek, P. D. With
The low bit rate of existing video encoders relies heavily on the accuracy of estimating the actual motion in the input video sequence. In this paper, we propose a video stabilization and encoding (ViSE) system that achieves a higher coding efficiency through a motion processing stage preceding the compression, of which the stabilization part compensates for vibrating camera motion. The improved motion prediction is obtained by differentiating between the temporally coherent motion and a noisier motion component which is orthogonal to the coherent one. The system compensates for the latter, undesirable motion, so that it is eliminated prior to video encoding. To reduce the computational complexity of integrating a digital stabilization algorithm with video encoding, we propose a system that reuses the already evaluated motion vectors from the stabilization stage in the compression. Compared to H.264, our system shows a 14% reduction in bit rate while obtaining an increase of about 0.5 dB in SNR.
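A hedged sketch of separating coherent camera motion from jitter: the accumulated global-motion trajectory is low-pass filtered to estimate the intended motion, and the residual is treated as the vibration to compensate before encoding. The global motion values are synthetic, and the motion-vector reuse inside the encoder is not shown.

```python
# Separate the intended (coherent) camera trajectory from jitter by smoothing the
# accumulated global motion; the residual is the component a stabilizer removes.
import numpy as np

rng = np.random.default_rng(2)
gmv = np.cumsum(np.full(60, 2.0) + rng.normal(0, 1.5, 60))   # 2 px/frame pan plus hand jitter

def smooth(traj, win=9):
    """Moving-average estimate of the intended (coherent) trajectory."""
    pad = win // 2
    padded = np.pad(traj, pad, mode="edge")
    kernel = np.ones(win) / win
    return np.convolve(padded, kernel, mode="valid")

coherent = smooth(gmv)
jitter = gmv - coherent            # the component compensated before encoding
print("max |jitter| =", float(np.max(np.abs(jitter))))
```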
{"title":"Reuse of Motion Processing for Camera Stabilization and Video Coding","authors":"Bao Lei, R. K. Gunnewiek, P. D. With","doi":"10.1109/ICME.2006.262479","DOIUrl":"https://doi.org/10.1109/ICME.2006.262479","url":null,"abstract":"The low bit rate of existing video encoders relies heavily on the accuracy of estimating actual motion in the input video sequence. In this paper, we propose a video stabilization and encoding (ViSE) system to achieve a higher coding efficiency through a preceding motion processing stage (to the compression), of which the stabilization part should compensate for vibrating camera motion. The improved motion prediction is obtained by differentiating between the temporal coherent motion and a more noisy motion component which is orthogonal to the coherent one. The system compensates the latter undesirable motion, so that it is eliminated prior to video encoding. To reduce the computational complexity of integrating a digital stabilization algorithm with video encoding, we propose a system that reuses the already evaluated motion vector from the stabilization stage in the compression. As compared to H.264, our system shows a 14% reduction in bit rate yet obtaining an increase of about 0.5 dB in SNR","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127197875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
An Efficient Memory Construction Scheme for an Arbitrary Side Growing Huffman Table
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262589
Sung-Wen Wang, Shang-Chih Chuang, Chih-Chieh Hsiao, Yi-Shin Tung, Ja-Ling Wu
By grouping the common prefixes of a Huffman tree, instead of the commonly used single-side growing Huffman tree (SGH-tree), we construct a memory-efficient Huffman table on the basis of an arbitrary-side growing Huffman tree (AGH-tree) to speed up Huffman decoding. Simulation results show that, in Huffman decoding, an AGH-tree based Huffman table is 2.35 times faster than that of Hashemian's method (an SGH-tree based one) and needs only one-fifth of the corresponding memory size. In summary, a novel Huffman table construction scheme is proposed in this paper which provides better performance than existing construction schemes in both decoding speed and memory usage.
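A small, hedged illustration of the prefix-grouping idea: codes that fit within the first k bits are resolved by a single table lookup, while longer codes fall back to a per-prefix subgroup. This shows the grouping principle only, not the AGH-tree construction of the paper; the code table is a made-up example.

```python
# Two-level Huffman decoding table: the first K bits index a table whose entries
# either resolve a short code directly or point to the small subgroup of longer
# codes sharing that K-bit prefix.
K = 3
codes = {"a": "0", "b": "10", "c": "110", "d": "11100", "e": "11101", "f": "1111"}

table = {}                                   # K-bit prefix -> ("hit", symbol, length) or ("group", [...])
for sym, code in codes.items():
    if len(code) <= K:
        for tail in range(2 ** (K - len(code))):         # every K-bit word starting with this code
            key = code + format(tail, f"0{K - len(code)}b")
            table[key] = ("hit", sym, len(code))
    else:
        table.setdefault(code[:K], ("group", []))[1].append((sym, code))

def decode(bits):
    out, i = [], 0
    while i < len(bits):
        key = bits[i:i + K].ljust(K, "0")                # pad at the end of the stream
        entry = table[key]
        if entry[0] == "hit":
            out.append(entry[1]); i += entry[2]
        else:                                            # linear scan only inside the small subgroup
            sym, code = next((s, c) for s, c in entry[1] if bits.startswith(c, i))
            out.append(sym); i += len(code)
    return "".join(out)

print(decode("0" + "10" + "11100" + "1111"))             # -> "abdf"
```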
{"title":"An Efficient Memory Construction Scheme for an Arbitrary Side Growing Huffman Table","authors":"Sung-Wen Wang, Shang-Chih Chuang, Chih-Chieh Hsiao, Yi-Shin Tung, Ja-Ling Wu","doi":"10.1109/ICME.2006.262589","DOIUrl":"https://doi.org/10.1109/ICME.2006.262589","url":null,"abstract":"By grouping the common prefix of a Huffman tree, in stead of the commonly used single-side rowing Huffman tree (SGH-tree), we construct a memory efficient Huffman table on the basis of an arbitrary-side growing Huffman tree (AGH-tree) to speed up the Huffman decoding. Simulation results show that, in Huffman decoding, an AGH-tree based Huffman table is 2.35 times faster that of the Hashemian's method (an SGH-tree based one) and needs only one-fifth the corresponding memory size. In summary, a novel Huffman table construction scheme is proposed in this paper which provides better performance than existing construction schemes in both decoding speed and memory usage","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127303039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3