
Latest publications from the 2008 IEEE 10th Workshop on Multimedia Signal Processing

Macroblock-based adaptive interpolation filter method using new filter selection in H.264/AVC
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665113
K. Yoon, J. H. Kim
The macroblock (MB)-based adaptive interpolation filter method has been considered able to achieve high coding efficiency in H.264/AVC. Although the conventional cost functions have shown good rate-distortion performance, there is still room for improvement. To improve coding efficiency, we introduce a new cost function that considers two bit rates, those of the motion vector and the prediction error, together with the reconstruction error of the MB. The filter that minimizes the proposed cost function is adaptively selected per MB. Experimental results show that the adaptive interpolation filter with the proposed cost function significantly improves coding efficiency compared to those using the conventional cost function, leading to an average bit rate reduction of about 5.19% (1 reference frame) and 5.14% (5 reference frames) relative to H.264/AVC.
{"title":"Macroblock-based adaptive interpolation filter method using new filter selection in H.264/AVC","authors":"K. Yoon, J. H. Kim","doi":"10.1109/MMSP.2008.4665113","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665113","url":null,"abstract":"The macroblock (MB)-based adaptive interpolation filter method has been considered to be able to achieve high coding efficiency in H.264/AVC. Although the conventional cost functions have showed a good performance in terms of rate and distortion, it still leaves room for improvement. To improve coding efficiency, we introduce a new cost function which considers two bit rates, motion vector and prediction error, and reconstruction error of MB. The filter which minimizes the proposed cost function is adaptively selected per MB. Experimental results show that the adaptive interpolation filter with the proposed cost function significantly improves the coding efficiency compared to ones using conventional cost function. It leads to about a 5.19% (1 reference frame) and 5.14% (5 reference frames) bit rate reduction on average compared to H.264/AVC, respectively.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134345340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
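As a rough illustration of the selection rule described in the abstract (not the authors' implementation), the sketch below picks, for each macroblock, the interpolation filter that minimizes a Lagrangian cost combining reconstruction distortion with motion-vector and prediction-error bit costs; the filter set, the bit-cost proxies and the lambda value are placeholder assumptions.

```python
import numpy as np

def select_filter_per_mb(mb, reference, filters, lam=0.85):
    """Pick the interpolation filter minimizing J = D_rec + lam * (R_mv + R_pred)
    for one macroblock. The bit-rate terms are crude proxies (assumptions),
    not the entropy coding of a real H.264/AVC encoder."""
    best_id, best_cost = None, float("inf")
    for fid, interpolate in filters.items():
        pred = interpolate(reference)                         # motion-compensated prediction
        residual = mb - pred
        d_rec = float(np.sum(residual ** 2))                  # reconstruction distortion (SSD)
        r_pred = float(np.count_nonzero(np.round(residual)))  # proxy for prediction-error bits
        r_mv = 8.0                                            # assumed fixed motion-vector bit cost
        cost = d_rec + lam * (r_mv + r_pred)
        if cost < best_cost:
            best_id, best_cost = fid, cost
    return best_id, best_cost

# Toy usage with two hypothetical filters: identity and a 3-tap horizontal smoother.
rng = np.random.default_rng(0)
ref = rng.integers(0, 255, (16, 16)).astype(float)
mb = ref + rng.normal(0.0, 2.0, (16, 16))
filters = {
    "identity": lambda r: r,
    "smooth": lambda r: (np.roll(r, 1, axis=1) + 2 * r + np.roll(r, -1, axis=1)) / 4.0,
}
print(select_filter_per_mb(mb, ref, filters))
```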
When multimedia advertising meets the new Internet era
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665039
Xiansheng Hua, Tao Mei, Shipeng Li
The advent of media-sharing sites, especially along with the so-called Web 2.0 wave, has led to the unprecedented Internet delivery of community-contributed media content such as images and videos, which have become primary sources for online advertising. However, conventional ad networks such as Google Adwords and AdSense treat image and video advertising as general text advertising, displaying ads relevant either to the queries or to the Web page content, without considering automatically monetizing the rich content of individual images and videos. In this paper, we summarize the trends of online advertising and propose an innovative advertising model driven by the compelling content of images and videos. We present the recently developed ImageSense and VideoSense as two exemplary applications dedicated to images and videos, respectively, in which the most contextually relevant ads are embedded at the most appropriate positions within the images or videos. The ads are selected based not only on textual relevance but also on visual similarity, so that they are contextually relevant both to the text in the Web page and to the visual content. The ad insertion positions are detected based on visual saliency analysis to minimize intrusiveness to the user. We also envision that the next trend of multimedia advertising will be game-like advertising.
{"title":"When multimedia advertising meets the new Internet era","authors":"Xiansheng Hua, Tao Mei, Shipeng Li","doi":"10.1109/MMSP.2008.4665039","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665039","url":null,"abstract":"The advent of media-sharing sites, especially along with the so called Web 2.0 wave, has led to the unprecedented Internet delivery of community-contributed media contents such as images and videos, which have become the primary sources for online advertising. However, conventional ad-networks such as Google Adwords and AdSense treat image and video advertising as general text advertising by displaying the ads either relevant to the queries or the Web page content, without considering automatically monetizing the rich contents of individual images and videos. In this paper, we summarize the trends of online advertising and propose an innovative advertising model driven by the compelling contents of images and videos. We present recently developed ImageSense and VideoSense as two exemplary applications dedicated to images and videos, respectively, in which the most contextually relevant ads are embedded at the most appropriate positions within the images or videos. The ads are selected based on not only textual relevance but also visual similarity so that the ads yield contextual relevance to both the text in the Web page and the visual content. The ad insertion positions are detected based on visual saliency analysis to minimize the intrusiveness to the user. We also envision that the next trend of multimedia advertising would be game-alike advertising.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"184 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124658042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
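A toy sketch of the two ingredients described in the abstract, under assumptions of my own (this is not the ImageSense/VideoSense code): ads are ranked by a weighted mix of term overlap with the hosting page and histogram similarity with the visual content, and an ad-sized window is placed where a saliency map is lowest.

```python
import numpy as np

def rank_ads(page_terms, content_hist, ads, w_text=0.6, w_visual=0.4):
    """Rank candidate ads by a weighted mix of textual relevance (term overlap
    with the hosting page) and visual similarity (histogram intersection with
    the image/video content). Features and weights are illustrative assumptions."""
    ranked = []
    for ad in ads:
        text_rel = len(page_terms & ad["terms"]) / max(len(ad["terms"]), 1)
        vis_sim = float(np.minimum(content_hist, ad["hist"]).sum())
        ranked.append((w_text * text_rel + w_visual * vis_sim, ad["name"]))
    return sorted(ranked, reverse=True)

def least_salient_position(saliency, ad_h, ad_w):
    """Slide an ad-sized window over a saliency map and return the top-left
    corner with the lowest total saliency, i.e. the least intrusive spot."""
    best, best_pos = float("inf"), (0, 0)
    for y in range(saliency.shape[0] - ad_h + 1):
        for x in range(saliency.shape[1] - ad_w + 1):
            s = float(saliency[y:y + ad_h, x:x + ad_w].sum())
            if s < best:
                best, best_pos = s, (y, x)
    return best_pos

# Hypothetical ads with normalized two-bin colour histograms.
ads = [{"name": "camera ad", "terms": {"camera", "travel"}, "hist": np.array([0.5, 0.5])},
       {"name": "pizza ad", "terms": {"pizza"}, "hist": np.array([0.9, 0.1])}]
print(rank_ads({"travel", "photo", "camera"}, np.array([0.4, 0.6]), ads))
```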
Comparison of different feature extraction techniques in content-based image retrieval for CT brain images
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665130
Wan Siti Halimatul Munirah Wan Ahmad, M. F. A. Fauzi
A content-based image retrieval (CBIR) system helps users retrieve relevant images based on their contents. A reliable content-based feature extraction technique is therefore required to effectively extract most of the information from the images. The important elements include the texture, colour, intensity or shape of the objects inside an image. When used in medical applications, CBIR can help medical experts in their diagnosis, for example by retrieving similar cases of a disease or monitoring a patient's progress. In this paper, several feature extraction techniques are explored to assess their effectiveness in retrieving medical images. The techniques are the Gabor transform, discrete wavelet frame, Hu moment invariants, Fourier descriptor, gray level histogram and gray level coherence vector. Experiments are conducted on 3,032 CT images of the human brain and promising results are reported.
{"title":"Comparison of different feature extraction techniques in content-based image retrieval for CT brain images","authors":"Wan Siti Halimatul Munirah Wan Ahmad, M. F. A. Fauzi","doi":"10.1109/MMSP.2008.4665130","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665130","url":null,"abstract":"Content-based image retrieval (CBIR) system helps users retrieve relevant images based on their contents. A reliable content-based feature extraction technique is therefore required to effectively extract most of the information from the images. These important elements include texture, colour, intensity or shape of the object inside an image. CBIR, when used in medical applications, can help medical experts in their diagnosis such as retrieving similar kind of disease and patientpsilas progress monitoring. In this paper, several feature extraction techniques are explored to see their effectiveness in retrieving medical images. The techniques are Gabor transform, discrete wavelet frame, Hu moment invariants, Fourier descriptor, gray level histogram and gray level coherence vector. Experiments are conducted on 3,032 CT images of human brain and promising results are reported.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125749014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 40
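A minimal retrieval sketch using one of the compared descriptors, the gray-level histogram; the bin count, the Euclidean distance and the (name, image) database layout are illustrative assumptions, and the other descriptors (Gabor, Hu moments, etc.) would simply replace the feature function.

```python
import numpy as np

def gray_histogram(image, bins=64):
    """Gray-level histogram descriptor; Gabor, Hu moments, etc. would slot
    in here as alternative feature functions."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)

def retrieve(query, database, k=5, feature=gray_histogram):
    """Return the k database images whose features lie closest to the query's
    in Euclidean distance; `database` is a list of (name, image) pairs."""
    q = feature(query)
    scored = [(float(np.linalg.norm(q - feature(img))), name) for name, img in database]
    return sorted(scored)[:k]

# Toy usage with random 8-bit "slices" standing in for CT images.
rng = np.random.default_rng(1)
db = [(f"slice_{i}", rng.integers(0, 256, (64, 64))) for i in range(20)]
print(retrieve(rng.integers(0, 256, (64, 64)), db, k=3))
```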
On the systematic generation of Tardos’s fingerprinting codes
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665174
M. Kuribayashi, N. Akashi, M. Morii
Digital fingerprinting is used to trace back illegal users: a unique ID, known as a digital fingerprint, is embedded into the content before distribution. In the generation of such fingerprints, one of the important properties is collusion-resistance. Binary fingerprinting codes with a code length of theoretically minimum order were proposed by Tardos, and related works have mainly focused on reducing the code length. In this paper, we present a concrete and systematic construction of Tardos's fingerprinting code using a chaotic map. Using a statistical model for the correlation scores, a proper threshold for detecting colluders is calculated. Furthermore, to reduce the computational cost required for detection, a hierarchical structure is introduced on the codewords. The collusion-resistance of the generated fingerprinting codes is evaluated by computer simulation.
{"title":"On the systematic generation of Tardos’s fingerprinting codes","authors":"M. Kuribayashi, N. Akashi, M. Morii","doi":"10.1109/MMSP.2008.4665174","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665174","url":null,"abstract":"Digital fingerprinting is used to trace back illegal users, where unique ID known as digital fingerprints is embedded into a content before distribution. On the generation of such fingerprints, one of the important properties is collusion-resistance. Binary codes for fingerprinting with a code length of theoretically minimum order were proposed by Tardos, and the related works mainly focused on the reduction of the code length were presented. In this paper, we present a concrete and systematic construction of the Tardospsilas fingerprinting code using a chaotic map. Using a statistical model for correlation scores, a proper threshold for detecting colluders is calculated. Furthermore, for the reduction of computational costs required for the detection, a hierarchical structure is introduced on the codewords. The collusion-resistance of the generated fingerprinting codes is evaluated by a computer simulation.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125064244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
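For orientation, a sketch of the standard randomized Tardos construction and its correlation-style accusation score. The paper's specific contribution, generating the per-position biases with a chaotic map, deriving the threshold from a statistical model, and the hierarchical codeword structure, is not reproduced here; the parameters below are illustrative.

```python
import numpy as np

def tardos_code(n_users, code_len, t=0.01, seed=0):
    """Standard randomized Tardos code: per-position biases p_i from the
    arcsine-like density on [t, 1 - t], codeword bits ~ Bernoulli(p_i).
    (The paper generates the biases with a chaotic map instead.)"""
    rng = np.random.default_rng(seed)
    r = rng.uniform(np.arcsin(np.sqrt(t)), np.arcsin(np.sqrt(1.0 - t)), code_len)
    p = np.sin(r) ** 2
    X = (rng.uniform(size=(n_users, code_len)) < p).astype(np.uint8)
    return X, p

def accusation_scores(X, p, y):
    """Correlation-style score of every user against the pirated word y;
    users whose score exceeds a threshold (derived in the paper from a
    statistical model) would be accused."""
    g1 = np.sqrt((1.0 - p) / p)    # contribution where the user holds a 1
    g0 = -np.sqrt(p / (1.0 - p))   # ... and where the user holds a 0
    contrib = np.where(X == 1, g1, g0)
    return (contrib * (y == 1)).sum(axis=1)

# Toy collusion: two users combine their codewords bitwise (OR is one feasible strategy).
X, p = tardos_code(n_users=50, code_len=2048)
pirate = np.maximum(X[3], X[7])
print(np.argsort(accusation_scores(X, p, pirate))[-3:])   # highest-scoring suspects
```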
Low-complexity frame importance modelling and resource allocation scheme for error-resilience H.264 video streaming
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665185
Gang Sun, Wei Xing, Dongming Lu
In this paper, we address the problem of redundancy allocation to protect against packet loss and provide better quality of service (QoS) in real-time H.264 video streaming. A novel error-resilient approach is proposed for the transmission of pre-encoded H.264 video streams over bandwidth-constrained networks. A novel frame importance model is derived for estimating a relative importance index for different H.264 video frames. Combined with the characteristics of the network, the optimal resource allocation strategy for different video frames can then be determined to achieve improved error resilience. The model uses a frame error propagation index (FEPI) to characterize the video quality degradation caused by error propagation from different frames in a GOP when packets are lost. The model can be calculated in the DCT domain with parameters extracted directly from the bitstream. The complexity of the proposed scheme is therefore very low, which makes it well suited to real-time video transmission. Simulation results show that the proposed scheme can remarkably improve the receiver-side reconstructed video quality under different channel loss patterns.
{"title":"Low-complexity frame importance modelling and resource allocation scheme for error-resilience H.264 video streaming","authors":"Gang Sun, Wei Xing, Dongming Lu","doi":"10.1109/MMSP.2008.4665185","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665185","url":null,"abstract":"In this paper, we addressed the problem of redundancy allocation for protecting packet loss for better quality of service (QoS) in real-time H.264 video streaming. A novel error-resilient approach is proposed for the transmission of pre-encoded H.264 video stream under bandwidth constrained networks. A novel frame importance model is derived for estimating relative importance index for different H.264 video frames. Combining with the characteristics of the network, the optimal resource allocation strategy for different video frames can be determined for achieving improved error resilience. The model uses frame error propagation index (FEPI) to characterize video quality degradation caused by error propagation in different frames in a GOP when suffer from packet loss. This model can be calculated in DCT domain with the parameters extracted directly from the bitstream. Therefore, the complexity of the proposed scheme is very low and much better for real-time video transmission. Simulation results show that the proposed scheme can improve the receiver side reconstructed video quality remarkably under different channel loss patterns.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130899045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
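A minimal sketch of importance-driven redundancy allocation: FEC packets are distributed over a GOP in proportion to per-frame importance indices such as the paper's FEPI. The proportional rule, the packet budget and the example index values are assumptions, not the paper's optimization.

```python
def allocate_redundancy(importance, total_fec_packets):
    """Distribute a fixed FEC packet budget over the frames of a GOP in
    proportion to their importance indices (e.g. the paper's FEPI); the
    proportional rule and the rounding policy are simplifying assumptions."""
    total = sum(importance) or 1.0
    alloc = [int(total_fec_packets * w / total) for w in importance]
    leftover = total_fec_packets - sum(alloc)      # packets lost to rounding
    for idx in sorted(range(len(importance)), key=lambda i: -importance[i])[:leftover]:
        alloc[idx] += 1                            # hand them to the most important frames
    return alloc

# Example: an I frame followed by P frames with decaying error-propagation impact.
print(allocate_redundancy([1.0, 0.6, 0.4, 0.25, 0.15], total_fec_packets=10))
```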
Segmentation of characters on car license plates
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665111
Xiangjian He, Lihong Zheng, Qiang Wu, W. Jia, B. Samali, M. Palaniswami
License plate recognition usually involves three steps, namely license plate detection/localization, character segmentation and character recognition. When reading the characters on a license plate one by one after the detection step, it is crucial to segment the characters accurately. The segmentation step may be affected by many factors, such as license plate boundaries (frames), and the recognition accuracy will be significantly reduced if the characters are not properly segmented. This paper presents an efficient algorithm for character segmentation on a license plate. The algorithm follows a plate detection step based on AdaBoost. It builds on an efficient and accurate skew and slant correction of the license plate and works together with removal of the plate boundary (frame). The algorithm is efficient and can be applied in real-time applications. Experiments are performed to show the accuracy of the segmentation.
{"title":"Segmentation of characters on car license plates","authors":"Xiangjian He, Lihong Zheng, Qiang Wu, W. Jia, B. Samali, M. Palaniswami","doi":"10.1109/MMSP.2008.4665111","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665111","url":null,"abstract":"License plate recognition usually contains three steps, namely license plate detection/localization, character segmentation and character recognition. When reading characters on a license plate one by one after license plate detection step, it is crucial to accurately segment the characters. The segmentation step may be affected by many factors such as license plate boundaries (frames). The recognition accuracy will be significantly reduced if the characters are not properly segmented. This paper presents an efficient algorithm for character segmentation on a license plate. The algorithm follows the step that detects the license plates using an AdaBoost algorithm. It is based on an efficient and accurate skew and slant correction of license plates, and works together with boundary (frame) removal of license plates. The algorithm is efficient and can be applied in real-time applications. The experiments are performed to show the accuracy of segmentation.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129896721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 32
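The abstract does not spell out the splitting rule itself, so the sketch below uses a generic column-projection heuristic on a deskewed, frame-removed plate image as a stand-in; the mean-based binarization and the minimum character width are assumptions.

```python
import numpy as np

def segment_characters(plate_gray, min_width=3):
    """Split a deskewed, frame-removed plate image into character column ranges
    via a vertical-projection heuristic; thresholding and the minimum width
    are assumptions of this sketch, not the paper's method."""
    binary = (plate_gray < plate_gray.mean()).astype(np.uint8)   # dark characters -> 1
    has_ink = binary.sum(axis=0) > 0                             # column projection profile
    segments, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            if x - start >= min_width:
                segments.append((start, x))
            start = None
    if start is not None and len(has_ink) - start >= min_width:
        segments.append((start, len(has_ink)))
    return segments                                              # list of (left, right) columns

# Toy plate: white background with two dark "characters".
plate = np.full((20, 40), 255.0)
plate[4:16, 5:10] = 0.0
plate[4:16, 20:26] = 0.0
print(segment_characters(plate))
```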
Standard-compliant multiple description image coding by spatial multiplexing and constrained least-squares restoration
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665102
Xiangjun Zhang, Xiaolin Wu
We propose a practical, standard-compliant multiple description (MD) image coding technique. Multiple descriptions of an image are generated in the spatial domain by an adaptive prefiltering and uniform down-sampling process. The resulting side descriptions are conventional square sample grids that are interleaved with one another, so each side description can be coded by any of the existing image compression standards. A side decoder reconstructs the input image by first decompressing the down-sampled image and then solving a least-squares inverse problem guided by a two-dimensional windowed piecewise autoregressive model. The central decoder is algorithmically similar to the side decoder, but it improves the reconstruction quality by using the received side descriptions as additional constraints when solving the underlying inverse problem. Compared with its predecessors, the proposed MD image technique offers the lowest encoder complexity, complete standard compliance, competitive rate-distortion performance, and superior subjective quality.
{"title":"Standard-compliant multiple description image coding by spatial multiplexing and constrained least-squares restoration","authors":"Xiangjun Zhang, Xiaolin Wu","doi":"10.1109/MMSP.2008.4665102","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665102","url":null,"abstract":"We propose a practical standard-compliant multiple description (MD) image coding technique. Multiple descriptions of an image are generated in the spatial domain by an adaptive prefiltering and uniform down sampling process. The resulting side descriptions are conventional square sample grids that are interleaved with one the other. As such each side description can be coded by any of the existing image compression standards. A side decoder reconstructs the input image by first decompressing the down-sampled image and then solving a least-squares inverse problem, guided by a two-dimensional windowed piecewise autoregressive model. The central decoder is algorithmically similar to the side decoder, but it improves the reconstruction quality by using received side descriptions as additional constraints when solving the underlying inverse problem. Compared with its predecessors the proposed image MD technique offers the lowest encoder complexity, complete standard compliance, competitive rate-distortion performance, and superior subjective quality.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127208499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
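A simplified sketch of the spatial-multiplexing idea: the image is split into two interleaved (checkerboard) descriptions, and a side decoder fills the missing samples from their available neighbours. The adaptive prefilter and the autoregressive least-squares restoration of the paper are replaced here by plain neighbour averaging, which is only a stand-in.

```python
import numpy as np

def split_descriptions(img):
    """Split an image into two interleaved descriptions on a checkerboard
    pattern; the paper's adaptive prefilter is omitted. `mask` marks the
    samples kept by the first description."""
    yy, xx = np.indices(img.shape)
    mask = (yy + xx) % 2 == 0
    return np.where(mask, img, 0), np.where(~mask, img, 0), mask

def side_decode(desc, known):
    """Reconstruct from one description by replacing each missing sample with
    the average of its available 4-neighbours - a crude stand-in for the
    paper's windowed autoregressive least-squares restoration."""
    img = desc.astype(float)
    pad = np.pad(img, 1, mode="edge")
    kpad = np.pad(known.astype(float), 1, mode="edge")
    neigh_sum = pad[:-2, 1:-1] + pad[2:, 1:-1] + pad[1:-1, :-2] + pad[1:-1, 2:]
    neigh_cnt = kpad[:-2, 1:-1] + kpad[2:, 1:-1] + kpad[1:-1, :-2] + kpad[1:-1, 2:]
    return np.where(known, img, neigh_sum / np.maximum(neigh_cnt, 1.0))

# Usage: a smooth ramp image is split and recovered from a single description.
img = np.add.outer(np.arange(8.0), np.arange(8.0))
d0, d1, mask = split_descriptions(img)
print(np.abs(side_decode(d0, mask) - img).max())   # small for smooth content
```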
Motion modeling with separate quad-tree structures for geometry and motion
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665106
R. Mathew, D. Taubman
Quad-tree structures are often used to model motion between frames of a video sequence. However, a fundamental limitation of the quad-tree structure is that it can only capture horizontal and vertical edge discontinuities at dyadically related locations. To address this limitation, recent work has focused on introducing geometry information at the nodes of tree-structured motion representations. In this paper we explore modelling boundary geometry and motion with separate quad-tree structures. Recent work on quad-tree representations has also highlighted the benefits of leaf merging. We extend the leaf-merging paradigm to incorporate separate tree structures for boundary geometry and motion. To achieve an efficient joint representation, we introduce polynomial motion models and piecewise linear boundary geometry to our quad-tree structures. Experimental results show that the approach taken in this paper provides significant improvement over previous quad-tree based motion representation schemes.
{"title":"Motion modeling with separate quad-tree structures for geometry and motion","authors":"R. Mathew, D. Taubman","doi":"10.1109/MMSP.2008.4665106","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665106","url":null,"abstract":"Quad-tree structures are often used to model motion between frames of a video sequence. However, a fundamental limitation of the quad-tree structure is that it can only capture horizontal and vertical edge discontinuities at dyadically related locations. To address this limitation recent work has focused on the introduction of geometry information to nodes of tree structured motion representations. In this paper we explore modeling boundary geometry and motion with separate quadtree structures. Recent work into quad-tree representations have also highlighted the benefits of leaf merging. We extend the leaf merging paradigm to incorporate separate tree structures for boundary geometry and motion. To achieve an efficient joint representation we introduce polynomial motion models and piecewise linear boundary geometry to our quad-tree structures. Experimental results show that the approach taken in this paper provides significant improvement over previous quad-tree based motion representation schemes.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126796832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
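A bare-bones sketch of the quad-tree partition step only: a block is split whenever the motion-model error inside it exceeds a threshold. The paper's separate geometry tree, piecewise linear boundaries, polynomial motion models and leaf merging are not modelled; the split criterion and threshold are assumptions.

```python
import numpy as np

def build_quadtree(error_map, x, y, size, threshold, min_size=4):
    """Recursively split a square block whenever the summed motion-model error
    inside it exceeds `threshold`, returning the leaf blocks as (x, y, size).
    Only the plain quad-tree partition is shown here."""
    err = float(error_map[y:y + size, x:x + size].sum())
    if err <= threshold or size <= min_size:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += build_quadtree(error_map, x + dx, y + dy, half, threshold, min_size)
    return leaves

# Usage: an error map concentrated in one corner forces finer blocks there.
err = np.zeros((64, 64))
err[:16, :16] = 1.0
print(len(build_quadtree(err, 0, 0, 64, threshold=10.0)))
```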
Image registration by means of 3D octree correlation
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665132
C. Ruwwe, B. Keck, Oliver Rusch, U. Zölzer, Xavier Loison
With no calibrated camera setup at hand, careful inspection of the imagery is needed to guarantee a feasible 3D reconstruction result based upon the images. We propose a new approach for image registration based on 3D octrees reconstructed by voxel carving. Correlating these models yields the translation offset that maximizes the intersection between models obtained from different images. Projecting the resulting three-dimensional translation offsets back into the image plane yields two two-dimensional image offsets that are used for the image registration.
{"title":"Image registration by means of 3D octree correlation","authors":"C. Ruwwe, B. Keck, Oliver Rusch, U. Zölzer, Xavier Loison","doi":"10.1109/MMSP.2008.4665132","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665132","url":null,"abstract":"With no calibrated camera setup at hand, careful inspection of the imagery is needed to guarantee a feasible 3D reconstruction result based upon the images. We propose a new approach for image registration based on reconstructed 3D octrees by voxel carving. Correlation of these models gives rise to a translation offset for a maximum intersection between different models from different images. Projecting the resulting three-dimensional translation offsets back into the image plane results in two two-dimensional image offsets that are used for the image registration.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"2014 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127580306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
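A brute-force sketch of the correlation step: two binary voxel occupancy grids (stand-ins for the reconstructed octrees) are exhaustively shifted against each other and the integer 3-D offset with maximum overlap is returned. The octree data structure and the projection of the offset back into the image planes are omitted; the grid size and search range are assumptions.

```python
import numpy as np

def best_translation(vox_a, vox_b, max_shift=4):
    """Exhaustively shift the binary occupancy grid vox_b against vox_a and
    return the integer 3-D offset giving the largest overlap. Dense arrays
    stand in for the octrees, and wrap-around at the borders is ignored."""
    best_score, best_shift = -1, (0, 0, 0)
    for dz in range(-max_shift, max_shift + 1):
        for dy in range(-max_shift, max_shift + 1):
            for dx in range(-max_shift, max_shift + 1):
                shifted = np.roll(vox_b, (dz, dy, dx), axis=(0, 1, 2))
                score = int(np.logical_and(vox_a, shifted).sum())
                if score > best_score:
                    best_score, best_shift = score, (dz, dy, dx)
    return best_shift, best_score

# Usage: a small cube of occupied voxels and a copy displaced by (-1, -2, 1).
a = np.zeros((16, 16, 16), dtype=bool)
a[5:9, 5:9, 5:9] = True
b = np.roll(a, (-1, -2, 1), axis=(0, 1, 2))
print(best_translation(a, b))   # expected recovered shift: (1, 2, -1)
```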
2-D dual multiresolution decomposition through NUDFB and its application
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665131
Nannan Ma, H. Xiong, Li Song
This paper aims to attain a sparser representation of a 2-D signal by introducing orientation resolution as a second multiresolution dimension besides multiscale, formulated as a dual multiresolution decomposition framework based on nonuniform directional frequency decompositions (NUDFB) under arbitrary scales. In this scheme, the NUDFB is realised by changing the topology of a non-symmetric binary tree (NSBT). Through this nonuniform division, an arbitrary orientation resolution r can be obtained in the direction c·2^(-r) under a target scale. Every two-channel filter bank at each node of the NSBT is designed as a paraunitary perfect-reconstruction filter bank, so the NUDFB is an orthogonal filter bank. This dual multiresolution decomposition has promising prospects in applications such as texture analysis, image processing and video coding. A potential application is presented by applying the NUDFB in the wavelet domain.
{"title":"2-D dual multiresolution decomposition through NUDFB and its application","authors":"Nannan Ma, H. Xiong, Li Song","doi":"10.1109/MMSP.2008.4665131","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665131","url":null,"abstract":"This paper aims to attain sparser representation of a 2-D signal by introducing orientation resolution as a second multiresolution besides multiscale, which is formulated to achieve a dual multiresolution decomposition framework by nonuniform directional frequency decompositions (NUDFB) under arbitrary scales. In this scheme, NUDFB is fulfilled by changing the topology structure of a non-symmetric binary tree (NSBT). Through this nonuniform division, we can get arbitrary orientation resolution r at a direction of c2-r under a target scale. Every two-channel filter bank on each node of this NSBT is designed to be a paraunitary perfect reconstruction filter bank, so NUDFB is an orthogonal filter bank. This dual multiresolution decomposition will definitely have bright prospect in its application, such as texture analysis, image processing or video coding. A potential application is presented by applying NUDFB in wavelet domain.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128745974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
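A purely structural sketch of the non-symmetric binary tree idea: enumerating the angular intervals covered by the leaves of a nested tuple tree shows how a lopsided topology gives finer orientation resolution in some directions than others. The directional filter design and the paraunitary filter banks of the NUDFB are not modelled, and the tuple encoding of the tree is my own assumption.

```python
def nsbt_orientation_bands(tree, lo=0.0, hi=180.0):
    """Enumerate the angular intervals covered by the leaves of a
    non-symmetric binary tree. `tree` is a nested (left, right) tuple;
    a leaf is None. Each internal node halves its angular range."""
    if tree is None:
        return [(lo, hi)]
    left, right = tree
    mid = (lo + hi) / 2.0
    return nsbt_orientation_bands(left, lo, mid) + nsbt_orientation_bands(right, mid, hi)

# A lopsided tree: two further split levels on the left branch, none on the right,
# giving 22.5-degree bands near 0 degrees and a single 90-degree band elsewhere.
tree = (((None, None), (None, None)), None)
print(nsbt_orientation_bands(tree))
```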