
2008 IEEE 10th Workshop on Multimedia Signal Processing: Latest Papers

Efficient stereo bitrate allocation for fully scalable audio codec
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665206
Te Li, S. Rahardja, S. Koh
The bit allocation algorithm for stereo channels in MPEG-4 scalable lossless coding (SLS) is not optimized. This paper presents a perceptually enhanced stereo bit allocation algorithm for fully scalable audio coding. The bitrate is allocated according to the energy distribution across channels, which is considerably more efficient. Experimental results show that the proposed method significantly improves the perceptual quality of fully scalable audio at various bitrates without introducing any new side information.
Citations: 0
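The abstract above says only that the bitrate is allocated according to the energy distribution across channels; it does not give the rule. The following is a minimal sketch under the assumption of a plain energy-proportional split, not the authors' perceptually enhanced algorithm. The function name `split_bits_by_energy` and its parameters are hypothetical.

```python
import numpy as np

def split_bits_by_energy(left, right, total_bits):
    """Split a bit budget between two channels in proportion to their energy.

    Assumption: a plain proportional rule, used here only for illustration.
    """
    left = np.asarray(left, dtype=np.float64)
    right = np.asarray(right, dtype=np.float64)
    e_left = float(left @ left)    # sum of squared samples
    e_right = float(right @ right)
    total_energy = e_left + e_right
    if total_energy == 0.0:
        # Both channels silent: fall back to an even split.
        bits_left = total_bits // 2
    else:
        bits_left = int(round(total_bits * e_left / total_energy))
    return bits_left, total_bits - bits_left
```

Because the split is derived from the signal itself, a decoder that sees the same energies could repeat the computation, which is one way an allocation can avoid extra side information.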
Developing a smart camera for road traffic surveillance
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665188
Bei Na Wei, Yu Shi, G. Ye, Jie Xu
Designing and implementing a smart camera system is challenging because computationally demanding image processing tasks must run continuously within the limited resources of an embedded system. This paper presents the hardware and software co-design and implementation of the first stage of TraffiCam, an FPGA-based smart camera prototype for traffic surveillance at intersections, consisting of a CMOS image sensor capture device and an FPGA main video processor. In particular, creative solutions for balancing gate array utilization, memory, and computation time are presented for the initial stage of Harris keypoint detection, together with a discussion of converting the algorithm implementation from a PC-based to an FPGA-based platform. Preliminary results show satisfactory real-time tracking and estimation performance.
Citations: 6
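For readers unfamiliar with the Harris keypoint detection named in the abstract above, here is a floating-point software sketch of the standard Harris corner response. It is not the TraffiCam FPGA design, which the abstract does not detail. The 3x3 summation window, the constant `k = 0.04`, and the function names are assumptions chosen for illustration.

```python
import numpy as np

def box_filter3(a):
    """Sum over each 3x3 neighbourhood, with zero padding at the border."""
    p = np.pad(a, 1)
    h, w = a.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3))

def harris_response(img, k=0.04):
    """Harris response R = det(M) - k * trace(M)^2 for every pixel."""
    img = np.asarray(img, dtype=np.float64)
    iy, ix = np.gradient(img)        # derivatives along rows, then columns
    sxx = box_filter3(ix * ix)       # entries of the structure tensor M
    syy = box_filter3(iy * iy)
    sxy = box_filter3(ix * iy)
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace
```

Corners give a positive response, straight edges a negative one, and flat regions a response near zero. A hardware version would typically replace the floating-point arithmetic with fixed-point and stream the window sums, which is where the trade-off between gate usage, memory, and computation time arises.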
Efficient and effective transformed image identification
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665141
M. Awrangjeb, Guojun Lu
The SIFT (scale invariant feature transform) has demonstrated superior performance in identifying transformed images compared with many other approaches. However, both its detection and matching stages are expensive, because a large number of keypoints are detected in the scale-space and each keypoint is described using a 128-dimensional vector. We present two possible solutions for feature-point reduction. The first is to downscale the image before SIFT keypoint detection; the second is to use corners (instead of SIFT keypoints), which are visually significant, more robust, and far fewer in number than the SIFT keypoints. Either the curvature descriptor or the highly distinctive SIFT descriptors at corner locations can be used to represent corners. We then describe a new feature-point matching technique, which can be used for matching both the downscaled SIFT keypoints and corners. Experimental results show that the two feature-point reduction solutions, combined with the SIFT descriptors and the proposed feature-point matching technique, not only improve computational efficiency and decrease the storage requirement, but also improve the transformed image identification accuracy (robustness).
Citations: 7
Region-based image categorization with reduced feature set
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665145
G. Herman, G. Ye, Jie Xu, Bang Zhang
In this paper we propose a new algorithm for region-based image categorization that is formulated as a multiple instance learning (MIL) problem. The proposed algorithm transforms the MIL problem into a traditional supervised learning problem, and solves it using a standard supervised learning method. The features used in the proposed algorithm are the hyperclique patterns, which are "condensed" into a small set of discriminative features. Each hyperclique pattern consists of multiple strongly correlated instances (i.e., features). As a result, hyperclique patterns are able to capture information that is not shared by individual features. The advantages of the proposed algorithm over existing algorithms are threefold: (i) unlike some existing algorithms, which use learning methods specifically designed for MIL or for certain datasets, the proposed algorithm uses a general-purpose standard supervised learning method; (ii) it uses a significantly smaller set of features, which are empirically more discriminative than the PCA features (i.e., principal components); and (iii) it is simple and efficient and achieves performance comparable to most state-of-the-art algorithms. The efficiency and good performance of the proposed algorithm make it a practical solution to general MIL problems. In this paper, we apply the proposed algorithm to both drug activity prediction and image categorization, and promising results are obtained.
Citations: 13
The SAIL speaker diarization system for analysis of spontaneous meetings
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665214
Kyu Jeong Han, P. Georgiou, Shrikanth S. Narayanan
In this paper, we propose a novel approach to speaker diarization of spontaneous meetings in our own multimodal SmartRoom environment. The proposed speaker diarization system first applies a sequential clustering concept to segment a given audio data source, and then performs agglomerative hierarchical clustering for speaker-specific classification (or speaker clustering) of the speech segments. The speaker clustering algorithm uses an incremental Gaussian mixture cluster modeling strategy and a stopping point estimation method based on information change rate. In experiments on various meeting conversation data totaling approximately 200 minutes, the system achieves a diarization error rate of 18.90% on average.
Citations: 5
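The agglomerative hierarchical clustering step mentioned in the abstract above can be illustrated with a deliberately simplified sketch. Two substitutions are made and should be read as assumptions: clusters are compared by the Euclidean distance between their mean feature vectors rather than by Gaussian mixture cluster models, and merging stops at a fixed distance threshold rather than at a point estimated from the information change rate. The name `agglomerate` and the `stop_distance` parameter are hypothetical.

```python
import numpy as np

def agglomerate(segments, stop_distance):
    """Repeatedly merge the two closest clusters of segment feature vectors.

    Stops when the closest pair is farther apart than stop_distance.
    Returns a list of clusters, each a list of segment indices.
    """
    feats = np.asarray(segments, dtype=np.float64)
    clusters = [[i] for i in range(len(feats))]
    while len(clusters) > 1:
        centroids = [feats[c].mean(axis=0) for c in clusters]
        best = None  # (distance, index_a, index_b) of the closest pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = float(np.linalg.norm(centroids[a] - centroids[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > stop_distance:
            break  # stopping criterion reached
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

In a diarization setting each final cluster would correspond to one hypothesized speaker, so the stopping rule effectively decides the estimated number of speakers.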
Graphical modeling and decoding of human actions
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665070
W. Li, Zhengyou Zhang, Zicheng Liu
This paper presents a graphical model for learning and recognizing human actions. Specifically, we propose to encode actions in a weighted directed graph, referred to as an action graph, whose nodes represent salient postures that are used to characterize the actions and are shared by all actions. The weight between two nodes measures the transition probability between the two postures. An action is encoded as one or multiple paths in the action graph. The salient postures are modeled using Gaussian mixture models (GMM). Both the salient postures and the action graph are automatically learned from training samples through unsupervised clustering and the expectation-maximization (EM) algorithm. Experimental results have verified the performance of the proposed model, its tolerance to noise and viewpoint changes, and its robustness across different subjects and datasets.
Citations: 11
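To make the action graph idea in the abstract above concrete, the sketch below scores a sequence of posture indices against per-action transition probabilities and picks the most likely action. It assumes the postures have already been identified for each frame, so the GMM posture models and the learning step are left out. The function names and the toy transition tables are hypothetical.

```python
import math

def path_log_prob(postures, transitions):
    """Log-probability of a posture sequence under one action's transitions.

    transitions[a][b] is the probability of moving from posture a to b.
    """
    logp = 0.0
    for a, b in zip(postures, postures[1:]):
        p = transitions[a][b]
        if p <= 0.0:
            return float("-inf")  # this path does not exist for the action
        logp += math.log(p)
    return logp

def classify(postures, action_graphs):
    """Return the action whose transition weights best explain the path."""
    return max(action_graphs,
               key=lambda name: path_log_prob(postures, action_graphs[name]))

# Toy example: three postures shared by two actions.
ACTION_GRAPHS = {
    "wave": [[0.1, 0.8, 0.1], [0.8, 0.1, 0.1], [0.4, 0.3, 0.3]],
    "bend": [[0.1, 0.1, 0.8], [0.3, 0.4, 0.3], [0.8, 0.1, 0.1]],
}
```

Calling `classify([0, 1, 0, 1], ACTION_GRAPHS)` selects `"wave"`, because that action assigns high weight to moving back and forth between postures 0 and 1.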
CCL-SVC: Optimizing user experience of broadcasting video on computation capability limited handheld devices
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665117
Jingyuan Wang, Lifeng Sun, Bin Li, Meng Zhang, Shiqiang Yang
In this paper, we propose a novel scheme using computing complexity layered scalable video coding (CCL-SVC) to optimize the user experience of broadcast video on handheld terminals with limited computing capability. To address the heterogeneity of computing capability among different handheld devices, we employ the hierarchical B reference structure of SVC to divide the frames into multiple computing complexity layers (CC Layers) on the server side. Each handheld client simply decodes the frames in the layers that match its computation capability, so as to maximize the video PSNR. We prove that the optimal CC Layers division problem is a precedence-constrained scheduling problem, which is NP-complete. We further propose a fast greedy method that approximately optimizes the playback PSNR of the broadcast video. Simulation shows that our method is superior to temporal SVC and to random frame discarding.
Citations: 1
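The hierarchical B reference structure mentioned in the abstract above can be illustrated by the usual dyadic layering, in which each frame's layer follows from its position in the group of pictures (GOP). This sketch shows only that underlying structure and a client that keeps the lowest layers. It is not the paper's CC Layers division or its greedy method, which the abstract does not specify. The function names and the default GOP size of 8 are assumptions.

```python
def temporal_layer(frame_index, gop_size=8):
    """Layer of a frame in a dyadic hierarchical-B GOP (0 = key frames)."""
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    pos = frame_index % gop_size
    if pos == 0:
        return 0
    levels = gop_size.bit_length() - 1            # log2(gop_size)
    trailing_zeros = (pos & -pos).bit_length() - 1
    return levels - trailing_zeros

def decodable_frames(num_frames, max_layer, gop_size=8):
    """Frames a client decodes if it can only afford layers 0..max_layer."""
    return [i for i in range(num_frames)
            if temporal_layer(i, gop_size) <= max_layer]
```

Frames in a given layer reference only frames in lower layers, so dropping the highest layers never breaks decoding of the frames that remain. A weaker device chooses a smaller `max_layer` and plays back at a lower frame rate.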
Sparse approximations for joint source-channel coding
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665126
G. Rath, C. Guillemot, J. Fuchs
This paper considers the application of sparse approximations in a joint source-channel (JSC) coding framework. The JSC coded system considered applies a real-number BCH code to the input signal before the signal is quantized and further processed. Under an impulse channel noise model, error decoding is posed as a sparse approximation problem. The orthogonal matching pursuit (OMP) and basis pursuit (BP) algorithms are compared with the syndrome decoding algorithm in terms of mean square reconstruction error. With a Gauss-Markov source and Bernoulli-Gaussian channel noise, BP outperforms syndrome decoding and OMP at higher noise levels. In the case of image transmission with channel bit errors, BP consistently outperforms the other two decoding algorithms.
Citations: 15
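The orthogonal matching pursuit (OMP) algorithm named in the abstract above can be sketched generically as follows. In the decoding setting the measurement vector would play the role of the syndrome and the recovered sparse vector the impulse errors, but the real-number BCH code construction is not reproduced here. The function signature, the fixed `sparsity` stopping rule, and the assumption of roughly unit-norm dictionary columns are illustrative choices.

```python
import numpy as np

def omp(A, y, sparsity):
    """Greedy sparse approximation: find x with few nonzeros so A @ x ~ y.

    Assumes the columns of A have (roughly) unit norm.
    """
    A = np.asarray(A, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    residual = y.copy()
    support = []                     # indices of the selected columns
    coeffs = np.zeros(0)
    for _ in range(sparsity):
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0   # never pick a column twice
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected coefficients by least squares.
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
    x = np.zeros(A.shape[1])
    x[support] = coeffs
    return x
```

Basis pursuit, the other method compared in the abstract, instead minimizes the L1 norm of `x` subject to `A @ x = y`, which requires a linear-programming or convex solver and is not shown.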
Singular block Toeplitz matrix approximation and application to multi-microphone speech dereverberation
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665048
Samir-Mohamad Omar, D. Slock
We consider the blind multichannel dereverberation problem for a single source. We have shown before [5] that the single-input multi-output (SIMO) reverberation filter can be equalized blindly by applying MIMO Linear Prediction (LP) to its output (after SISO input pre-whitening). In this paper, we investigate LP-based dereverberation in a noisy environment and/or under acoustic channel length underestimation. Treating ambient noise and late reverberation as additive noise, we propose a postfilter that transforms the MIMO prediction filter into a somewhat longer equalizer. The postfilter allows equalization to a non-zero delay. Both MMSE-ZF and MMSE design criteria are considered for the postfilter. We also focus on a computationally efficient (FFT-based) block Toeplitz covariance matrix enhancement that enforces the SIMO-filtered source plus white noise structure before applying MIMO LP. A second suggested refinement is to iterate between SISO and MIMO LP. Simulations show that the proposed scheme is robust in noisy environments and performs better than the classic Delay-&-Predict equalizer and the Delay-&-Sum beamformer.
Citations: 7
Adaptive Multiple Experts System for personal identification using facial behaviour biometrics
Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665158
Pohsiang Tsai, Tich Phuoc Tran, T. Hintz, T. Jan
Physiological and/or behavioural characteristics of humans such as face, gait and/or voice have been used in biometric recognition technology. Apart from these characteristics (which have been reported in the literature), this research investigates the hypothesis that facial behaviour can be used for human identification. We analysed and proposed a multiple experts system, called Adaptive Multiple Experts System (AMES), for validating our hypothesis and analysis. We used the Japanese Female Facial Expression (JAFFE) database, as it provides the facial behaviour traits for data collection. The experimental results indicate that facial behaviours may provide information about individual differences and thus may be used as another behavioural biometric.
Citations: 1