2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA): Latest Papers

Bilateral hemiface feature representation learning for pose robust facial expression recognition

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

Pub Date : 2016-12-15 DOI: 10.1109/APSIPA.2016.7820781

Wissam J. Baddar, Yong Man Ro

We propose bilateral hemiface feature representation learning via convolutional neural networks (CNNs) for pose-robust facial expression recognition. The proposed method considers two characteristics of facial expressions. First, features from local patches are more robust to pose variations. Second, human faces are bilaterally symmetrical across the left and right hemifaces. To incorporate these characteristics, a CNN is devised to learn feature representations from local patches. Then, feature representations are learned from each hemiface separately. To reduce the effect of self-occlusion, a shared feature representation is learned by combining both hemiface feature representations. The shared feature representation adaptively learns to utilize the hemiface feature representations according to the head pose. Experiments conducted on the Multi-PIE dataset showed that the proposed bilateral hemiface feature representation is pose robust and compares favorably to state-of-the-art methods.
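The preprocessing implied by the two characteristics above (bilateral symmetry and local patches) can be sketched as follows. This is only an illustrative sketch, not the authors' pipeline: the function names `split_hemifaces` and `extract_patches`, the mirroring of the right hemiface, and the non-overlapping patch grid are all assumptions, and the CNN itself is not shown.

```python
def split_hemifaces(image):
    """Split a row-major image into left and right halves.

    The right half is mirrored so both halves share the same orientation
    (an assumption made here for illustration). For odd widths the centre
    column is dropped.
    """
    width = len(image[0])
    mid = width // 2
    left = [row[:mid] for row in image]
    right = [row[width - mid:][::-1] for row in image]
    return left, right


def extract_patches(half, size):
    """Return non-overlapping size x size patches in row-major order."""
    patches = []
    for top in range(0, len(half) - size + 1, size):
        for lft in range(0, len(half[0]) - size + 1, size):
            patches.append([row[lft:lft + size] for row in half[top:top + size]])
    return patches
```

Each patch list would then be fed to a per-hemiface feature extractor; how the two hemiface representations are combined according to head pose is learned in the paper and is not reproduced here.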

Citations: 1

Robust blind deconvolution for PMMW images with sparsity presentation

Pub Date : 2016-12-01 DOI: 10.1109/APSIPA.2016.7820680

Tingting Liu, Zengzhao Chen, Hai Liu, Sanya Liu, Zhaoli Zhang, Taihe Cao

Passive millimeter-wave (PMMW) images often suffer from low resolution, noise, and blurring. In this paper, we propose a blind image deconvolution method for PMMW images. The purpose of the proposed method is to estimate the point spread function (PSF) and the restored image simultaneously. In this method, the data fidelity term is constructed under a Gaussian noise assumption, and the regularization term is the hyper-Laplacian function ‖x‖^0.6, which is fitted to high-resolution PMMW images. Moreover, a data-selection matrix is proposed to select the regions that are helpful for estimating the PSF accurately. The proposed method has been applied to both simulated and real PMMW images. Comparative results demonstrate that the proposed method significantly outperforms state-of-the-art deconvolution methods in both qualitative and quantitative assessments.
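The cost described above, a Gaussian data fidelity term plus a hyper-Laplacian penalty with exponent 0.6, can be sketched for a 1-D signal as follows. This is a minimal sketch under stated assumptions: the abstract does not say whether the penalty acts on pixel values or on gradients (it is applied directly to the samples here), the data-selection matrix is modelled as a per-sample 0/1 weight `mask`, and the weight `lam` is a hypothetical parameter.

```python
def convolve_same(x, k):
    """1-D 'same' convolution with zero padding (k is assumed to have odd length)."""
    half = len(k) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, kj in enumerate(k):
            idx = i - (j - half)
            if 0 <= idx < len(x):
                acc += kj * x[idx]
        out.append(acc)
    return out


def objective(x, k, y, mask, lam=0.01, p=0.6):
    """Masked least-squares fidelity plus a hyper-Laplacian penalty sum(|x_i|^p)."""
    blurred = convolve_same(x, k)
    # Gaussian noise assumption -> squared error; mask selects the samples that count.
    fidelity = 0.5 * sum(m * (b - yi) ** 2 for m, b, yi in zip(mask, blurred, y))
    # Sparsity-promoting prior with exponent p = 0.6 (assumed to act on x itself).
    prior = sum(abs(v) ** p for v in x)
    return fidelity + lam * prior


sharp = [0.0, 0.0, 1.0, 0.0, 0.0]
psf = [0.25, 0.5, 0.25]
observed = convolve_same(sharp, psf)
mask = [1.0] * len(sharp)
```

With these toy values, the sharp signal scores lower than the blurred observation used as its own estimate, which is the behaviour a sparsity prior is meant to encourage. A blind method would alternate between updating `x` and `k`; that loop is omitted.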

Citations: 2

Investigation of noun-verb dissociation based on EEG source reconstruction

Pub Date : 2016-12-01 DOI: 10.1109/APSIPA.2016.7820817

Bin Zhao, J. Dang, Gaoyan Zhang

To clarify whether grammatical category or semantic meaning is the underlying determinant of noun-verb dissociation in brain topography, this study recorded 128-channel electroencephalographic (EEG) signals from the scalps of 22 subjects while they listened to (i) unambiguous nouns (UN), (ii) unambiguous verbs (UV), (iii) noun-biased ambiguous words (AN), and (iv) verb-biased ambiguous words (AV). A current density source reconstruction algorithm with a standardized low-resolution electromagnetic tomography constraint was then applied to the EEG signals to uncover the brain dynamics during word processing. In our results, the noun-verb dissociation appeared in the periods of 150–250 ms and 380–450 ms, during which activation differences in the visual occipital cortex and motor frontal cortex were observed in both the UN-UV and AN-AV contrasts. The results suggest that semantic differences might lead to the noun-verb dissociation.

Citations: 2

Quality preserving depth estimation in sequential stereo images

Pub Date : 2016-12-01 DOI: 10.1109/APSIPA.2016.7820869

Ji-Hun Mun, Yo-Sung Ho

The computational complexity of local stereo matching depends on the disparity range. For depth estimation in sequential stereo images, high computational complexity is an obstacle to real-time processing. In this paper, we propose a temporal-correlation-based stereo matching method for sequential images. However, using temporal information in sequential stereo matching yields inaccurate disparity ranges, because the accuracy of the estimated depth map gradually degrades. To preserve the depth map quality in the temporal stereo matching procedure, we adopt guided image filtering for matching cost aggregation. Since guided image filtering has a structure similar to that of the bilateral filter, it preserves object boundary regions even with restricted disparity search ranges. Disparity values estimated inaccurately from the temporal correlation are compensated by the filtering-based cost aggregation. The experimental results confirm that the proposed depth map acquisition method preserves depth map quality in temporal-domain stereo matching.
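The idea of restricting the disparity search using the previous frame can be sketched on a single scanline as follows. This is a simplified sketch, not the authors' method: it uses a plain sum-of-absolute-differences window cost instead of guided image filtering, and the names `match_scanline`, `prev_disp`, and `margin` are hypothetical.

```python
def sad_cost(left, right, x, d, radius=1):
    """Sum of absolute differences between a left window at x and a right window at x - d."""
    cost = 0
    for dx in range(-radius, radius + 1):
        xl = min(max(x + dx, 0), len(left) - 1)        # clamp at the borders
        xr = min(max(x + dx - d, 0), len(right) - 1)
        cost += abs(left[xl] - right[xr])
    return cost


def match_scanline(left, right, max_disp, prev_disp=None, margin=1):
    """Winner-takes-all disparity per pixel.

    Without prev_disp the full range [0, max_disp] is searched. With prev_disp
    (the disparities of the previous frame) only [prev - margin, prev + margin]
    is searched, which reduces the number of cost evaluations.
    """
    disparities = []
    for x in range(len(left)):
        if prev_disp is None:
            candidates = range(0, max_disp + 1)
        else:
            lo = max(0, prev_disp[x] - margin)
            hi = min(max_disp, prev_disp[x] + margin)
            candidates = range(lo, hi + 1)
        best = min(candidates, key=lambda d: sad_cost(left, right, x, d))
        disparities.append(best)
    return disparities


right_line = [10, 20, 30, 40, 50, 60, 70, 80]
left_line = [10, 10, 10, 20, 30, 40, 50, 60]      # right_line shifted by 2 pixels
first_frame = match_scanline(left_line, right_line, max_disp=4)
next_frame = match_scanline(left_line, right_line, max_disp=4, prev_disp=first_frame)
```

The restricted search evaluates at most `2 * margin + 1` candidates per pixel. If the previous estimate is wrong by more than `margin`, the true disparity falls outside the window, which is the degradation the abstract's cost aggregation step is meant to counter.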

Citations: 0

Improving BLSTM RNN based Mandarin speech recognition using accent dependent bottleneck features

Pub Date : 2016-12-01 DOI: 10.1109/APSIPA.2016.7820723

Jiangyan Yi, Hao Ni, Zhengqi Wen, J. Tao

This paper proposes an approach to accent adaptation that uses accent-dependent bottleneck (BN) features to improve the performance of a multi-accent Mandarin speech recognition system. The adaptation architecture uses two neural networks. First, a deep neural network (DNN) acoustic model acts as a feature extractor that produces accent-dependent BN (BN-DNN) features. The input features of the BN-DNN model are MFCC features appended with i-vector features. Second, an acoustic model based on a bidirectional long short-term memory (BLSTM) recurrent neural network (RNN) is used to perform accent-specific adaptation. The input features of the BLSTM RNN model are accent-dependent BN features appended with MFCC features. Experiments on the RASC863 and CASIA regional accent speech corpora show that the proposed method yields a clear improvement over the BLSTM RNN baseline model.
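The feature flow between the two networks can be sketched as plain per-frame concatenation. This is only a shape-level sketch: `bn_dnn_stub` is a hypothetical placeholder for the trained BN-DNN, and the toy dimensions (4 MFCC coefficients, a 2-dimensional i-vector, 3 bottleneck units) are assumptions chosen for readability.

```python
def append_per_frame(frames, extra):
    """Append the same utterance-level vector (e.g. an i-vector) to every frame."""
    return [list(f) + list(extra) for f in frames]


def concat_per_frame(a, b):
    """Concatenate two frame-aligned feature sequences."""
    if len(a) != len(b):
        raise ValueError("feature streams must have the same number of frames")
    return [list(x) + list(y) for x, y in zip(a, b)]


def bn_dnn_stub(frames, bn_dim=3):
    """Hypothetical stand-in for the BN-DNN: returns bn_dim values per frame."""
    return [[sum(f) / len(f)] * bn_dim for f in frames]


mfcc = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]   # 2 frames x 4 coefficients
ivector = [1.0, -1.0]                                  # one vector per utterance

dnn_input = append_per_frame(mfcc, ivector)            # MFCC + i-vector -> BN-DNN
bn_features = bn_dnn_stub(dnn_input)                   # accent-dependent BN features
blstm_input = concat_per_frame(bn_features, mfcc)      # BN + MFCC -> BLSTM RNN
```

Here each BN-DNN input frame has 6 values and each BLSTM input frame has 7, which mirrors the two concatenation steps named in the abstract.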

Citations: 8

Analysis of adaptation rate of the FXLMS algorithm

Pub Date : 2016-12-01 DOI: 10.1109/APSIPA.2016.7820696

Kiyonori Terauchi, K. Motonaka, Y. Kajikawa, S. Miyoshi

We analyze the behavior of active noise control using a statistical-mechanical method. The principal assumption used in the analysis is that the impulse responses of the primary path and the adaptive filter are sufficiently long. In particular, in this paper we analyze the adaptation rate of the mean square error (MSE) using two measures. The first measure is the initial rate of decrease of the MSE. The second measure is an adaptation constant, defined as the negative of the maximum eigenvalue of the coefficient matrix of the differential equations that describe the dynamical behavior of the macroscopic variables. Introducing these two measures, we theoretically show that the optimal step size depends on whether we focus on the rate of decrease in the MSE at the initial stage or on the MSE after sufficient adaptation time.
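For readers unfamiliar with the algorithm being analyzed, a textbook filtered-x LMS loop can be sketched as follows. This is a generic illustration, not the statistical-mechanical analysis of the paper: the path coefficients, the step size `mu`, and the assumption that the secondary-path estimate equals the true path are all chosen here for the toy example.

```python
import random


def fxlms(x, primary, secondary, secondary_est, n_taps, mu):
    """Run filtered-x LMS on reference signal x; return final weights and the error signal."""
    w = [0.0] * n_taps
    x_hist = [0.0] * max(n_taps, len(primary), len(secondary_est))  # recent reference samples
    fx_hist = [0.0] * n_taps                                        # recent filtered-reference samples
    y_hist = [0.0] * len(secondary)                                 # recent anti-noise samples
    errors = []
    for sample in x:
        x_hist = [sample] + x_hist[:-1]
        d = sum(p * xv for p, xv in zip(primary, x_hist))           # noise at the error microphone
        y = sum(wi * xv for wi, xv in zip(w, x_hist))               # adaptive filter output
        y_hist = [y] + y_hist[:-1]
        y_sec = sum(s * yv for s, yv in zip(secondary, y_hist))     # anti-noise after the secondary path
        e = d - y_sec                                               # residual error
        fx = sum(s * xv for s, xv in zip(secondary_est, x_hist))    # reference filtered by the path estimate
        fx_hist = [fx] + fx_hist[:-1]
        w = [wi + mu * e * fxv for wi, fxv in zip(w, fx_hist)]      # FXLMS weight update
        errors.append(e)
    return w, errors


random.seed(0)
reference = [random.gauss(0.0, 1.0) for _ in range(4000)]
weights, errors = fxlms(reference, primary=[0.0, 0.0, 0.8, 0.4],
                        secondary=[0.0, 1.0], secondary_est=[0.0, 1.0],
                        n_taps=4, mu=0.01)
early_mse = sum(e * e for e in errors[:500]) / 500
late_mse = sum(e * e for e in errors[-500:]) / 500
```

Rerunning the demo with different values of `mu` and comparing `early_mse` with `late_mse` gives an empirical feel for the trade-off the abstract describes between fast initial decrease and the error after long adaptation.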

Citations: 0

A noise masking method with adaptive thresholds based on CASA

Pub Date : 2016-12-01 DOI: 10.1109/APSIPA.2016.7820880

Feng Bao, W. Abdulla

In this paper, we propose a novel noise masking method based on Computational Auditory Scene Analysis (CASA) that uses an adaptive factor. Although CASA has succeeded to some extent in speech separation and speech enhancement, the use of fixed thresholds for segregation and labeling heavily affects the processing performance. To address this issue, the proposed method uses the normalized cross-correlation coefficients between the power spectra of noisy speech and pure noise to find an adaptive threshold, so that the pitch contour and time-frequency units can be obtained more accurately. Then, a revised algorithm smooths the current binary mask value by checking the time-frequency units in adjacent frames and neighboring channels around the current time-frequency unit, in order to remove erroneous local masks. Results of two kinds of signal-to-noise ratio tests show that the proposed method outperforms conventional spectral subtraction, Wiener filtering, and CASA methods.
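The two ingredients named above, a normalized cross-correlation coefficient and neighborhood-based mask smoothing, can be sketched as follows. The coefficient itself is the standard definition; everything else is an assumption made for illustration: the linear mapping from coefficient to threshold in `adaptive_threshold`, its `low` and `high` bounds, and the all-neighbors-disagree rule in `smooth_mask` are hypothetical and are not taken from the paper.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation coefficient between two equal-length sequences."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def adaptive_threshold(noisy_power, noise_power, low=0.3, high=0.7):
    """Hypothetical rule: the more noise-like the frame, the higher the threshold."""
    rho = max(0.0, ncc(noisy_power, noise_power))
    return low + (high - low) * rho


def smooth_mask(mask):
    """Flip a time-frequency unit only when every neighbor (adjacent frames and channels) disagrees."""
    rows, cols = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for c in range(rows):
        for t in range(cols):
            neigh = [mask[i][j]
                     for i in range(max(0, c - 1), min(rows, c + 2))
                     for j in range(max(0, t - 1), min(cols, t + 2))
                     if (i, j) != (c, t)]
            ones = sum(neigh)
            if mask[c][t] == 1 and ones == 0:
                out[c][t] = 0          # isolated speech label, treated as erroneous
            elif mask[c][t] == 0 and ones == len(neigh):
                out[c][t] = 1          # isolated noise label inside a speech region
    return out
```

For example, `smooth_mask([[0, 0, 0], [0, 1, 0], [0, 0, 0]])` clears the isolated centre unit, while units that agree with at least one neighbor are left unchanged.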

Citations: 3

Head pose estimation using random forest and texture analysis

Pub Date : 2016-12-01 DOI: 10.1109/APSIPA.2016.7820742

Min-Joo Kang, Hana Lee, Jewon Kang

In this paper, we propose a new head pose estimation technique based on Random Forest (RF) and Multi-scale Block Local Binary Pattern (MB-LBP) features. The proposed technique aims to learn randomized trees with useful attributes to improve the estimation accuracy and the tolerance to occlusions and illumination changes. Specifically, a number of MB-LBP feature spaces are generated from a face image, and random inputs and random features, such as the MB-LBP scale parameter and the block coordinates in the pool, are used to build each tree. Furthermore, we develop a split function that considers the properties of the uniform LBP; it is applied at each internal node of the tree to maximize the information gain at that node. The randomized trees combined in the RF are used for the final decision under a maximum a posteriori (MAP) criterion. Experimental results demonstrate that the proposed technique performs well in head pose estimation under various conditions of illumination, pose, expression, and facial occlusion.
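The MB-LBP feature and the uniform-pattern test mentioned above can be sketched as follows. This follows the common definition of MB-LBP (block means compared against the centre block mean); the bit ordering, the `>=` comparison, and the function names are conventions assumed here, and the random forest and split function of the paper are not shown.

```python
def block_mean(img, top, left, size):
    """Mean intensity of the size x size block whose top-left corner is (top, left)."""
    total = sum(img[r][c] for r in range(top, top + size) for c in range(left, left + size))
    return total / (size * size)


def mb_lbp(img, top, left, size):
    """8-bit MB-LBP code of the 3x3 grid of blocks starting at (top, left).

    `size` is the scale parameter: each of the nine blocks is size x size pixels.
    """
    means = [[block_mean(img, top + i * size, left + j * size, size) for j in range(3)]
             for i in range(3)]
    center = means[1][1]
    # Neighbors visited clockwise, starting from the top-left block.
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for i, j in order:
        code = (code << 1) | (1 if means[i][j] >= center else 0)
    return code


def is_uniform(code):
    """A pattern is uniform if its circular bit string has at most two 0/1 transitions."""
    bits = [(code >> k) & 1 for k in range(8)]
    transitions = sum(bits[k] != bits[(k + 1) % 8] for k in range(8))
    return transitions <= 2
```

With `size=1` the code reduces to the ordinary 3x3 LBP; larger values of `size` give the coarser scales that the abstract samples at random when building each tree.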

Citations: 1

Canine emotional states assessment with heart rate variability

Pub Date : 2016-12-01 DOI: 10.1109/APSIPA.2016.7820868

Eri Nakahara, Yuki Maruno, Takatomi Kubo, Rina Ouchi, Maki Katayama, K. Fujiwara, M. Nagasawa, T. Kikusui, K. Ikeda

A person's emotions affect that person's performance in a task, and the same holds for the emotions of a rescue dog working after a disaster. Hence, if the handler can estimate a rescue dog's emotions, the dog's performance and welfare can be improved. Emotions also appear in physiological signals such as heart rate variability (HRV). In fact, HRV carries information about emotions in both humans and dogs. To make emotion estimation more practical, we propose a method for estimating emotions from the HRV of dogs and evaluate its performance using real data. The method classified positive, negative, and neutral emotions with 88% accuracy within each subject and 72% accuracy over all subjects. These accuracies are high enough for practical use with rescue dogs.
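The abstract does not list which HRV features were used, so the following sketch computes three standard time-domain measures (mean RR interval, SDNN, and RMSSD) as an illustrative assumption of what a classifier input might contain. The function name `hrv_features` and the sample RR values are hypothetical.

```python
import math
import statistics


def hrv_features(rr_ms):
    """Time-domain HRV features from successive RR intervals in milliseconds."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return {
        "mean_rr": statistics.fmean(rr_ms),                            # average beat-to-beat interval
        "sdnn": statistics.stdev(rr_ms),                               # overall variability
        "rmssd": math.sqrt(statistics.fmean([d * d for d in diffs])),  # short-term variability
    }


features = hrv_features([800, 810, 790, 800])
```

A feature vector like this, computed over a sliding window, could then be passed to any three-class classifier for the positive, negative, and neutral labels; the classifier used in the paper is not specified in the abstract.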

Citations: 2

Fast HEVC screen content coding by skipping unnecessary checking of intra block copy mode based on CU activity and gradient

Pub Date : 2016-12-01 DOI: 10.1109/APSIPA.2016.7820900

Sik-Ho Tsang, Wei Kuang, Yui-Lam Chan, W. Siu

The Intra Block Copy (IntraBC) mode is a very efficient coding tool for the screen content coding (SCC) extension of High Efficiency Video Coding (HEVC); it works by finding repeating patterns within the same frame. Yet it also brings impractically high computational complexity to SCC, which can be double that of conventional HEVC, because exhaustive block matching is performed within the same frame even though some constraints already apply to the IntraBC mode. To reduce the complexity, we propose skipping unnecessary IntraBC mode checking based on the activity and gradient within the coding unit (CU). With the proposed methods, the increase in encoding time relative to conventional HEVC is reduced from 90.0% to 62.2% on average, while coding efficiency is maintained with only a negligible bitrate increase.
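An early-skip decision of the kind described above can be sketched as follows. The abstract does not give the exact activity and gradient definitions or the thresholds, so the measures below (mean absolute deviation and mean absolute neighbor difference), the thresholds `act_thr` and `grad_thr`, and the rule that smooth CUs skip the IntraBC check are all assumptions for illustration.

```python
def cu_activity(block):
    """Mean absolute deviation of the samples from the CU mean."""
    n = len(block) * len(block[0])
    mean = sum(sum(row) for row in block) / n
    return sum(abs(v - mean) for row in block for v in row) / n


def cu_gradient(block):
    """Sum of absolute horizontal and vertical neighbor differences, per sample."""
    h = sum(abs(row[c + 1] - row[c]) for row in block for c in range(len(row) - 1))
    v = sum(abs(block[r + 1][c] - block[r][c])
            for r in range(len(block) - 1) for c in range(len(block[0])))
    return (h + v) / (len(block) * len(block[0]))


def skip_intrabc(block, act_thr=1.0, grad_thr=1.0):
    """Hypothetical rule: skip the IntraBC search when the CU is smooth."""
    return cu_activity(block) < act_thr and cu_gradient(block) < grad_thr


flat_cu = [[128] * 4 for _ in range(4)]           # smooth region
text_cu = [[0, 255, 0, 255] for _ in range(4)]    # sharp, text-like edges
```

In this toy setting `skip_intrabc(flat_cu)` is true and `skip_intrabc(text_cu)` is false, so the expensive block matching would run only for the CU with sharp edges.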

Citations: 15
