
Latest publications from the 2008 IEEE 10th Workshop on Multimedia Signal Processing

Macroblock-based adaptive interpolation filter method using new filter selection in H.264/AVC
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665113
K. Yoon, J. H. Kim
The macroblock (MB)-based adaptive interpolation filter method has been considered to be able to achieve high coding efficiency in H.264/AVC. Although the conventional cost functions have shown good performance in terms of rate and distortion, they still leave room for improvement. To improve coding efficiency, we introduce a new cost function which considers two bit rates, those of the motion vector and the prediction error, and the reconstruction error of the MB. The filter which minimizes the proposed cost function is adaptively selected per MB. Experimental results show that the adaptive interpolation filter with the proposed cost function significantly improves coding efficiency compared to one using the conventional cost function. It leads to average bit-rate reductions of about 5.19% (1 reference frame) and 5.14% (5 reference frames) compared to H.264/AVC.
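As a rough, illustrative sketch of the per-MB filter selection described above (not the authors' implementation), the following Python snippet picks, for each macroblock, the candidate interpolation filter that minimizes a combined cost of the motion-vector rate, the prediction-error rate, and the MB reconstruction error; the cost weighting and the toy candidate values are assumptions.

```python
import numpy as np

def mb_cost(rate_mv_bits, rate_res_bits, recon_error, lam=0.85):
    """Hypothetical cost combining the two bit rates and the MB reconstruction
    error, in the spirit of the cost function described in the abstract."""
    return recon_error + lam * (rate_mv_bits + rate_res_bits)

def select_filter_per_mb(candidates):
    """candidates: list of dicts, one per candidate interpolation filter, each
    holding the rates and reconstruction error measured for one macroblock.
    Returns the index of the filter with the minimum cost."""
    costs = [mb_cost(c["rate_mv"], c["rate_res"], c["recon_err"]) for c in candidates]
    return int(np.argmin(costs))

# Toy example: two candidate filters evaluated for a single macroblock.
filters = [
    {"rate_mv": 14, "rate_res": 220, "recon_err": 310.0},  # standard 6-tap filter
    {"rate_mv": 14, "rate_res": 205, "recon_err": 325.0},  # adaptive filter
]
print("selected filter index:", select_filter_per_mb(filters))
```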
Citations: 0
On the systematic generation of Tardos’s fingerprinting codes
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665174
M. Kuribayashi, N. Akashi, M. Morii
Digital fingerprinting is used to trace back illegal users: a unique ID, known as a digital fingerprint, is embedded into the content before distribution. In the generation of such fingerprints, one of the important properties is collusion resistance. Binary fingerprinting codes with a code length of theoretically minimum order were proposed by Tardos, and related works have mainly focused on reducing the code length. In this paper, we present a concrete and systematic construction of Tardos's fingerprinting code using a chaotic map. Using a statistical model for correlation scores, a proper threshold for detecting colluders is calculated. Furthermore, to reduce the computational cost of detection, a hierarchical structure is introduced on the codewords. The collusion resistance of the generated fingerprinting codes is evaluated by computer simulation.
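For readers unfamiliar with Tardos codes, here is a minimal Python sketch of the standard construction and correlation-score accusation; the logistic map used to drive the bias generation is a stand-in for the paper's (unspecified here) chaotic map, and the hierarchical codeword structure and statistically derived threshold are not reproduced.

```python
import numpy as np

def logistic_map(x0, n, r=3.99):
    # Logistic chaotic map as a stand-in pseudo-random source for the bias values.
    out, x = np.empty(n), x0
    for i in range(n):
        x = r * x * (1.0 - x)
        out[i] = x
    return out

def generate_tardos_code(n_users, code_len, c=3, seed=0.31):
    # Column biases p_i = sin^2(r_i), r_i drawn from [t', pi/2 - t'] (Tardos, 2003).
    t = 1.0 / (300.0 * c)
    t_prime = np.arcsin(np.sqrt(t))
    u = logistic_map(seed, code_len)
    r = t_prime + (np.pi / 2 - 2 * t_prime) * u
    p = np.sin(r) ** 2
    rng = np.random.default_rng(0)
    X = (rng.random((n_users, code_len)) < p).astype(np.uint8)
    return X, p

def accusation_scores(X, p, y):
    # Original Tardos correlation score, accumulated only where the pirate bit y_i = 1.
    sigma1 = np.sqrt((1.0 - p) / p)
    sigma0 = np.sqrt(p / (1.0 - p))
    contrib = np.where(X == 1, sigma1, -sigma0)
    return (contrib * (y == 1)).sum(axis=1)

X, p = generate_tardos_code(n_users=20, code_len=2000)
y = X[0] & X[1]                      # toy collusion: AND of two colluders' codewords
scores = accusation_scores(X, p, y)
print("top suspects:", np.argsort(scores)[::-1][:3])
```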
Citations: 15
Efficient and effective transformed image identification
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665141
M. Awrangjeb, Guojun Lu
The SIFT (scale invariant feature transform) has demonstrated superior performance in identifying transformed images over many other approaches. However, both its detection and matching stages are expensive, because a large number of keypoints are detected in the scale-space and each keypoint is described using a 128-dimensional vector. We present two possible solutions for feature-point reduction. The first is to downscale the image before SIFT keypoint detection, and the second is to use corners (instead of SIFT keypoints), which are visually significant, more robust, and much smaller in number than the SIFT keypoints. Either the curvature descriptor or the highly distinctive SIFT descriptors at corner locations can be used to represent corners. We then describe a new feature-point matching technique, which can be used for matching both the down-scaled SIFT keypoints and the corners. Experimental results show that the two feature-point reduction solutions, combined with the SIFT descriptors and the proposed feature-point matching technique, not only improve computational efficiency and decrease the storage requirement, but also improve transformed-image identification accuracy (robustness).
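A minimal OpenCV sketch of the two feature-point reduction routes described above, assuming cv2.SIFT_create is available in the installed OpenCV build; the matching shown is the plain Lowe ratio test, not the paper's new matching technique.

```python
import cv2
import numpy as np

def sift_on_downscaled(gray, scale=0.5):
    # Route 1: shrink the image first so far fewer scale-space keypoints are detected.
    small = cv2.resize(gray, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)
    sift = cv2.SIFT_create()
    return sift.detectAndCompute(small, None)

def sift_at_corners(gray, max_corners=200):
    # Route 2: detect corners (visually significant, far fewer in number than SIFT
    # keypoints) and describe them with SIFT descriptors at a fixed patch size.
    corners = cv2.goodFeaturesToTrack(gray, maxCorners=max_corners,
                                      qualityLevel=0.01, minDistance=8)
    kps = [cv2.KeyPoint(float(x), float(y), 16) for [[x, y]] in corners]
    sift = cv2.SIFT_create()
    return sift.compute(gray, kps)

def ratio_test_match(d1, d2, ratio=0.75):
    # Plain ratio-test matching between two descriptor sets.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for m, n in matcher.knnMatch(d1, d2, k=2):
        if m.distance < ratio * n.distance:
            good.append(m)
    return good
```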
Citations: 7
Region-based image categorization with reduced feature set
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665145
G. Herman, G. Ye, Jie Xu, Bang Zhang
In this paper we propose a new algorithm for region-based image categorization that is formulated as a multiple instance learning (MIL) problem. The proposed algorithm transforms the MIL problem into a traditional supervised learning problem, and solves it using a standard supervised learning method. The features used in the proposed algorithm are hyperclique patterns, which are "condensed" into a small set of discriminative features. Each hyperclique pattern consists of multiple strongly-correlated instances (i.e., features). As a result, hyperclique patterns are able to capture information that is not shared by individual features. The advantages of the proposed algorithm over existing algorithms are threefold: (i) unlike some existing algorithms which use learning methods specifically designed for MIL or for certain datasets, the proposed algorithm uses a general-purpose standard supervised learning method; (ii) it uses a significantly smaller set of features which are empirically more discriminative than the PCA features (i.e., principal components); and (iii) it is simple and efficient and achieves performance comparable to most state-of-the-art algorithms. The efficiency and good performance of the proposed algorithm make it a practical solution to general MIL problems. In this paper, we apply the proposed algorithm to both drug activity prediction and image categorization, and promising results are obtained.
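The sketch below only illustrates the "transform MIL into standard supervised learning" step: each bag is mapped to a fixed-length histogram over k-means instance prototypes and fed to an ordinary SVM. The prototype histogram is a simplification for illustration, not the paper's hyperclique-pattern feature construction.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def embed_bags(bags, prototypes):
    # Map each bag (a variable-size set of instance vectors) to a fixed-length
    # histogram over instance prototypes, so a standard classifier can be used.
    feats = []
    for bag in bags:
        idx = np.argmin(np.linalg.norm(bag[:, None, :] - prototypes[None], axis=2), axis=1)
        hist = np.bincount(idx, minlength=len(prototypes)).astype(float)
        feats.append(hist / hist.sum())
    return np.vstack(feats)

# Toy data: 40 bags of 5-15 instances each in R^8, with random bag labels.
rng = np.random.default_rng(0)
bags = [rng.normal(size=(rng.integers(5, 16), 8)) for _ in range(40)]
labels = rng.integers(0, 2, size=40)

prototypes = KMeans(n_clusters=10, n_init=10, random_state=0).fit(
    np.vstack(bags)).cluster_centers_
X = embed_bags(bags, prototypes)
clf = SVC(kernel="rbf").fit(X, labels)      # ordinary supervised learner on bag features
print("training accuracy:", clf.score(X, labels))
```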
Citations: 13
The SAIL speaker diarization system for analysis of spontaneous meetings
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665214
Kyu Jeong Han, P. Georgiou, Shrikanth S. Narayanan
In this paper, we propose a novel approach to speaker diarization of spontaneous meetings in our own multimodal SmartRoom environment. The proposed speaker diarization system first applies a sequential clustering concept to segmentation of a given audio data source, and then performs agglomerative hierarchical clustering for speaker-specific classification (or speaker clustering) of the speech segments. The speaker clustering algorithm utilizes an incremental Gaussian mixture cluster modeling strategy and a stopping-point estimation method based on information change rate. Through experiments on various meeting conversation data of approximately 200 minutes total length, this system is demonstrated to provide a diarization error rate of 18.90% on average.
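A simplified sketch of the speaker-clustering stage: per-segment statistics are clustered bottom-up with a distance threshold as the stopping rule. The incremental GMM cluster models and the information-change-rate criterion of the paper are replaced here by plain mean/std embeddings and scikit-learn's agglomerative clustering.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def cluster_segments(segment_features, distance_threshold=8.0):
    """segment_features: list of (n_frames, n_mfcc) arrays, one per speech segment
    produced by a prior segmentation step. Each segment is summarized by its mean
    and std, then merged bottom-up; the threshold stands in for the paper's
    information-change-rate stopping criterion."""
    embeddings = np.vstack([np.concatenate([seg.mean(0), seg.std(0)])
                            for seg in segment_features])
    ahc = AgglomerativeClustering(n_clusters=None, linkage="average",
                                  distance_threshold=distance_threshold)
    return ahc.fit_predict(embeddings)   # one speaker label per segment

# Toy usage: 6 segments of 13-dim MFCC-like features from 2 simulated speakers.
rng = np.random.default_rng(1)
segs = [rng.normal(loc=(i % 2) * 3.0, size=(100, 13)) for i in range(6)]
print(cluster_segments(segs))
```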
Citations: 5
Developing a smart camera for road traffic surveillance
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665188
Bei Na Wei, Yu Shi, G. Ye, Jie Xu
Smart camera system design and implementation is a challenging task due to the constant need to perform computationally demanding image processing tasks within the limited resource constraints of embedded systems. This paper presents the hardware and software co-design and implementation of the first stage of TraffiCam, an FPGA-based smart camera prototype for traffic surveillance at intersections, consisting of a CMOS image sensor capture device and an FPGA main video processor. In particular, creative solutions for balancing gate array utilization, memory and computation time are presented for the initial stage of Harris keypoint detection, with discussions on converting the algorithm implementation from a PC-based to an FPGA-based platform. Preliminary results show satisfactory real-time tracking and estimation performance.
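As a PC-side reference for the Harris keypoint stage mentioned above (not the fixed-point FPGA design), a plain NumPy/SciPy sketch of the Harris response and a simple thresholded keypoint selection:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(gray, sigma=1.5, k=0.04):
    # Classic Harris corner response R = det(M) - k * trace(M)^2, where M is the
    # Gaussian-weighted structure tensor of the image gradients.
    gray = gray.astype(float)
    ix = sobel(gray, axis=1)
    iy = sobel(gray, axis=0)
    ixx = gaussian_filter(ix * ix, sigma)
    iyy = gaussian_filter(iy * iy, sigma)
    ixy = gaussian_filter(ix * iy, sigma)
    det = ixx * iyy - ixy ** 2
    trace = ixx + iyy
    return det - k * trace ** 2

def harris_keypoints(gray, thresh_rel=0.01, max_points=300):
    # Keep the strongest responses above a relative threshold (no non-maximum
    # suppression here, to keep the sketch short).
    r = harris_response(gray)
    ys, xs = np.where(r > thresh_rel * r.max())
    order = np.argsort(r[ys, xs])[::-1][:max_points]
    return list(zip(xs[order], ys[order]))
```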
Citations: 6
When multimedia advertising meets the new Internet era
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665039
Xiansheng Hua, Tao Mei, Shipeng Li
The advent of media-sharing sites, especially along with the so-called Web 2.0 wave, has led to the unprecedented Internet delivery of community-contributed media content such as images and videos, which have become primary sources for online advertising. However, conventional ad networks such as Google Adwords and AdSense treat image and video advertising as general text advertising by displaying ads relevant either to the queries or to the Web page content, without considering automatically monetizing the rich content of individual images and videos. In this paper, we summarize the trends of online advertising and propose an innovative advertising model driven by the compelling content of images and videos. We present the recently developed ImageSense and VideoSense as two exemplary applications dedicated to images and videos, respectively, in which the most contextually relevant ads are embedded at the most appropriate positions within the images or videos. The ads are selected based not only on textual relevance but also on visual similarity, so that the ads are contextually relevant to both the text in the Web page and the visual content. The ad insertion positions are detected based on visual saliency analysis to minimize intrusiveness to the user. We also envision that the next trend of multimedia advertising will be game-like advertising.
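A toy sketch of the kind of scoring the abstract describes: ads are ranked by a weighted mix of textual and visual similarity to the page, and an overlay position is chosen where saliency is lowest. The feature vectors, weights, and corner-based placement are hypothetical simplifications, not the ImageSense/VideoSense implementation.

```python
import numpy as np

def rank_ads(page_text_vec, page_visual_vec, ads, w_text=0.6, w_visual=0.4):
    """ads: list of dicts with precomputed 'text_vec' and 'visual_vec' features.
    Scores each ad by a weighted sum of cosine similarities to the page content."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
    scores = [w_text * cos(page_text_vec, ad["text_vec"]) +
              w_visual * cos(page_visual_vec, ad["visual_vec"]) for ad in ads]
    return np.argsort(scores)[::-1]          # ad indices, most relevant first

def least_salient_corner(saliency_map, box=(80, 60)):
    # Pick the corner region with the lowest average saliency as the least
    # intrusive overlay position (a crude stand-in for the paper's analysis).
    bh, bw = box
    corners = {"top-left": saliency_map[:bh, :bw],
               "top-right": saliency_map[:bh, -bw:],
               "bottom-left": saliency_map[-bh:, :bw],
               "bottom-right": saliency_map[-bh:, -bw:]}
    return min(corners, key=lambda k: corners[k].mean())
```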
Citations: 23
2-D dual multiresolution decomposition through NUDFB and its application
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665131
Nannan Ma, H. Xiong, Li Song
This paper aims to attain a sparser representation of a 2-D signal by introducing orientation resolution as a second multiresolution besides multiscale, formulated as a dual multiresolution decomposition framework built from nonuniform directional frequency decompositions (NUDFB) under arbitrary scales. In this scheme, the NUDFB is realized by changing the topology structure of a non-symmetric binary tree (NSBT). Through this nonuniform division, we can obtain an arbitrary orientation resolution r in a direction of c2-r under a target scale. Every two-channel filter bank on each node of this NSBT is designed to be a paraunitary perfect-reconstruction filter bank, so the NUDFB is an orthogonal filter bank. This dual multiresolution decomposition has bright prospects for applications such as texture analysis, image processing, and video coding. A potential application is presented by applying the NUDFB in the wavelet domain.
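To make the non-symmetric binary tree (NSBT) idea concrete, here is a 1-D analogue in Python: a paraunitary (orthonormal Haar) two-channel split is applied recursively, but only to the branch named in a split plan, giving non-uniform resolution. The paper's 2-D directional filter banks are not reproduced.

```python
import numpy as np

def haar_split(x):
    # Orthonormal two-channel (paraunitary) analysis: lowpass and highpass halves.
    x = np.asarray(x, dtype=float)
    if len(x) % 2:
        x = x[:-1]
    lo = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    hi = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return lo, hi

def nsbt_decompose(x, split_plan):
    """split_plan: sequence of 'lo'/'hi' choices; at each level only the chosen
    child is split further, so resolution is finer wherever the plan descends."""
    bands, current = [], x
    for choice in split_plan:
        lo, hi = haar_split(current)
        if choice == "lo":
            bands.append(("hi-band", hi))
            current = lo
        else:
            bands.append(("lo-band", lo))
            current = hi
    bands.append(("final", current))
    return bands

sig = np.sin(2 * np.pi * 0.31 * np.arange(256)) + 0.1 * np.random.default_rng(0).normal(size=256)
for name, band in nsbt_decompose(sig, ["lo", "hi", "hi"]):
    print(name, band.shape)
```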
Citations: 3
Segmentation of characters on car license plates
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665111
Xiangjian He, Lihong Zheng, Qiang Wu, W. Jia, B. Samali, M. Palaniswami
License plate recognition usually consists of three steps, namely license plate detection/localization, character segmentation, and character recognition. When reading characters on a license plate one by one after the license plate detection step, it is crucial to segment the characters accurately. The segmentation step may be affected by many factors, such as license plate boundaries (frames). The recognition accuracy will be significantly reduced if the characters are not properly segmented. This paper presents an efficient algorithm for character segmentation on a license plate. The algorithm follows a license plate detection step based on an AdaBoost algorithm. It relies on an efficient and accurate skew and slant correction of the license plate, and works together with removal of the license plate boundary (frame). The algorithm is efficient and can be applied in real-time applications. Experiments are performed to show the accuracy of the segmentation.
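The abstract does not spell out the segmentation rule itself, so the sketch below uses a common baseline: binarize the deskewed, frame-removed plate and cut characters at gaps in the vertical projection (OpenCV + NumPy), which is not necessarily the paper's exact method.

```python
import cv2
import numpy as np

def segment_characters(plate_bgr, min_width=4):
    """Split an already detected, deskewed, frame-removed plate image into
    per-character crops using the vertical projection of the binarized plate."""
    gray = cv2.cvtColor(plate_bgr, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    col_sum = (binary > 0).sum(axis=0)          # foreground pixels per column
    in_char, start, chars = False, 0, []
    for x, count in enumerate(col_sum):
        if count > 0 and not in_char:           # entering a character run
            in_char, start = True, x
        elif count == 0 and in_char:            # leaving a character run
            in_char = False
            if x - start >= min_width:
                chars.append(plate_bgr[:, start:x])
    if in_char and len(col_sum) - start >= min_width:
        chars.append(plate_bgr[:, start:])      # character touching the right edge
    return chars
```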
Citations: 32
Low-complexity frame importance modelling and resource allocation scheme for error-resilience H.264 video streaming
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665185
Gang Sun, Wei Xing, Dongming Lu
In this paper, we address the problem of redundancy allocation for protecting against packet loss to achieve better quality of service (QoS) in real-time H.264 video streaming. A novel error-resilient approach is proposed for the transmission of pre-encoded H.264 video streams over bandwidth-constrained networks. A novel frame importance model is derived to estimate a relative importance index for different H.264 video frames. Combined with the characteristics of the network, the optimal resource allocation strategy for different video frames can be determined to achieve improved error resilience. The model uses a frame error propagation index (FEPI) to characterize the video quality degradation caused by error propagation across the frames in a GOP under packet loss. The model can be calculated in the DCT domain with parameters extracted directly from the bitstream. Therefore, the complexity of the proposed scheme is very low, making it well suited for real-time video transmission. Simulation results show that the proposed scheme can remarkably improve the reconstructed video quality at the receiver side under different channel loss patterns.
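A simplified sketch of importance-driven redundancy allocation: a frame-importance index that decays along the prediction chain within a GOP stands in for the paper's DCT-domain FEPI, and the FEC byte budget is split in proportion to it. The decay factor and budget are illustrative assumptions.

```python
import numpy as np

def frame_importance(gop_size, decay=0.85):
    # Earlier frames in a GOP propagate errors to more successors, so they get
    # higher weight; the geometric decay is an illustrative stand-in for FEPI.
    idx = np.arange(gop_size)
    w = decay ** idx
    return w / w.sum()

def allocate_redundancy(total_fec_bytes, gop_size):
    # Split the redundancy budget across frames in proportion to importance.
    weights = frame_importance(gop_size)
    alloc = np.floor(weights * total_fec_bytes).astype(int)
    alloc[0] += total_fec_bytes - alloc.sum()   # give rounding leftovers to the I-frame
    return alloc

print(allocate_redundancy(total_fec_bytes=1200, gop_size=8))
```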
Citations: 1