2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): latest papers

Speech emotion recognition using transfer non-negative matrix factorization
Peng Song, S. Ou, Wenming Zheng, Yun Jin, Li Zhao
In practical situations, emotional speech utterances are often collected with different devices and under different conditions, which can noticeably affect recognition performance. To address this issue, this paper presents a novel transfer non-negative matrix factorization (TNMF) method for cross-corpus speech emotion recognition. First, the NMF algorithm is adopted to learn a latent common feature space for the source and target datasets. Then, the discrepancies between the feature distributions of the different corpora are considered, and the maximum mean discrepancy (MMD) algorithm is used to measure their similarity. Finally, the TNMF approach, which integrates the NMF and MMD algorithms, is proposed. Experiments are carried out on two popular datasets, and the results verify that the TNMF method can significantly outperform the automatic and competitive methods for cross-corpus speech emotion recognition.
DOI: https://doi.org/10.1109/ICASSP.2016.7472665 · Published 2016-03-20
Citations: 30
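As a rough illustration of the maximum mean discrepancy (MMD) term mentioned in the abstract above, the sketch below computes the squared MMD with a linear kernel, which reduces to the squared distance between the two feature means. This is only a minimal example under stated assumptions: the function names and the toy feature vectors are hypothetical, the kernel choice is an assumption, and the code is not the TNMF objective from the paper.

```python
# Illustrative sketch only: linear-kernel MMD between two feature sets.
# Function names and data are hypothetical, not taken from the paper.

def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def linear_mmd_squared(source, target):
    """Squared MMD with a linear kernel: ||mean(source) - mean(target)||^2."""
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))

source_feats = [[1.0, 2.0], [3.0, 4.0]]  # toy "source corpus" features
target_feats = [[2.0, 2.0], [4.0, 6.0]]  # toy "target corpus" features
print(linear_mmd_squared(source_feats, target_feats))
```

A value of zero means the two sets have identical means; a transfer method would typically penalize larger values so that the learned features of both corpora stay close.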
Super-resolution DOA estimation via continuous group sparsity in the covariance domain
Cheng-Yu Hung, M. Kaveh
Estimation of directions of arrival (DoAs) under the spatial covariance model is studied. Unlike compressed sensing methods, which discretize the search domain into candidate directions on a grid, the theory of super-resolution is applied to estimate DoAs in the continuous domain. We reformulate the spatial spectral covariance model into a Multiple Measurement Vector (MMV)-like model and propose a block total variation norm minimization approach, which is the analog of Group Lasso in the super-resolution framework and promotes group sparsity. The DoAs can be estimated by solving the dual problem via semidefinite programming. This gridless recovery approach is verified by simulation results for both uncorrelated and correlated source signals.
DOI: https://doi.org/10.1109/ICASSP.2016.7472239 · Published 2016-03-20
Citations: 2
A parameter-free Cauchy-Schwartz information measure for independent component analysis
Lei Sun, Badong Chen, K. Toh, Zhiping Lin
Independent component analysis (ICA) based on an information measure has seen wide application in engineering. Unlike traditional information measures based on the probability density function, this paper proposes a Cauchy-Schwartz information measure for multiple variables based on the survival distribution. Empirical estimation of the survival distribution is parameter-free, a property inherited by the estimator of the new information measure. The measure is proved to be a valid measure of statistical independence and is adopted as the objective function of an ICA algorithm, which is validated by an experiment. This work shows the promising potential of survival-distribution-based information measures for ICA.
DOI: https://doi.org/10.1109/ICASSP.2016.7472132 · Published 2016-03-20
Citations: 1
Efficient stochastic detector for large-scale MIMO
Junmei Yang, Chuan Zhang, Shugong Xu, X. You
In this paper, a low-complexity stochastic belief propagation (BP) detector for large-scale MIMO is proposed for the first time. Its efficient hardware architecture, with a parallel pipeline, is presented in detail. Thanks to the stochastic approach, all arithmetic operations of the detector are implemented with simple logic structures. Several approaches that can potentially improve the detection performance are exploited. Simulation results demonstrate that the stochastic BP detector achieves detection performance similar to that of its deterministic counterpart for a 32 × 32 MIMO system with 4-quadrature amplitude modulation (4-QAM). As the number of antennas increases, the detection performance improves at a linear cost in complexity and latency. Therefore, the proposed stochastic BP detector is suitable for large-scale MIMO applications, offering a good balance between detection performance and implementation complexity.
DOI: https://doi.org/10.1109/ICASSP.2016.7472939 · Published 2016-03-20
Citations: 12
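The abstract above notes that arithmetic in the stochastic detector is implemented with simple logic structures. Assuming this refers to stochastic computing in the usual sense (an assumption, since the listing gives no architectural detail), the sketch below shows the textbook idea: a probability is encoded as a random bitstream, and a bitwise AND of two independent streams approximates their product. All names, the stream length, and the seed are hypothetical choices for illustration, not the paper's design.

```python
import random

def to_bitstream(p, length, rng):
    """Encode probability p in [0, 1] as a random bitstream (unipolar format)."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def stream_value(bits):
    """Decode a bitstream back to the fraction of ones it contains."""
    return sum(bits) / len(bits)

def stochastic_multiply(a_bits, b_bits):
    """Bitwise AND of two independent streams approximates the product."""
    return [a & b for a, b in zip(a_bits, b_bits)]

rng = random.Random(0)  # fixed seed so the run is repeatable
a = to_bitstream(0.5, 20000, rng)
b = to_bitstream(0.6, 20000, rng)
product_estimate = stream_value(stochastic_multiply(a, b))
print(product_estimate)  # close to 0.5 * 0.6, up to sampling noise
```

The accuracy of the estimate depends on the stream length, which is one way to see the trade-off between precision and latency in this style of arithmetic.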
Generalized k-level cutset sampling and reconstruction
Shengxin Zha, T. Pappas
We propose a family of cutset sampling schemes and a generalized k-level image reconstruction approach formulated under a minimum mean squared error (MMSE) framework. The k-level reconstruction approach is a direct generalization of the recently proposed pattern-based approach, and can be applied to periodic samples either on a cutset or on a grid. Our experimental results indicate that the generalization of the k-level reconstruction approach results in only a small performance loss. For rectangular cutsets, we show that the proposed approach outperforms the cutset-MRF approach as well as two inpainting approaches. Moreover, we show that combining the cutset sampling with an additional point sample inside the periodic structure outperforms k-level reconstruction from cutset sampling and point sampling under comparable sampling densities.
DOI: https://doi.org/10.1109/ICASSP.2016.7471963 · Published 2016-03-20
Citations: 2
Secure M-PSK communication via directional modulation
A. Kalantari, Mojtaba Soltanalian, S. Maleki, S. Chatzinotas, B. Ottersten
In this work, a directional modulation-based technique is devised to enhance the security of a multi-antenna wireless communication system that employs M-PSK modulation to convey information. The directional modulation method operates by steering the array beam in such a way that the phase of the received signal at the receiver matches that of the intended M-PSK symbol. Due to the difference between the channels of the legitimate receiver and the eavesdropper, the signals received by the eavesdropper generally have a phase component different from that of the actual symbols. As a result, a transceiver that employs directional modulation can impose a high symbol error rate on the eavesdropper without needing to know the eavesdropper's channel. The optimal directional modulation beamformer is designed to minimize the consumed power subject to satisfying a specific resulting phase and minimal signal amplitude at each antenna of the legitimate receiver. The simulation results show that directional modulation results in a much higher symbol error rate at the eavesdropper compared to the conventional benchmark scheme, i.e., zero-forcing precoding at the transmitter.
DOI: https://doi.org/10.1109/ICASSP.2016.7472324 · Published 2016-03-20
Citations: 21
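To make the phase argument in the abstract above concrete, the following sketch maps symbol indices to M-PSK constellation points and detects them by nearest phase. It then applies an extra phase rotation to stand in for an eavesdropper's different channel. The 90-degree rotation, the noiseless setting, and all function names are assumptions made for illustration; the beamformer design from the paper is not reproduced here.

```python
import cmath
import math

def psk_symbol(k, m):
    """Unit-amplitude M-PSK constellation point for symbol index k."""
    return cmath.exp(2j * math.pi * k / m)

def psk_detect(received, m):
    """Pick the symbol index whose phase is closest to the received phase."""
    step = 2 * math.pi / m
    return round(cmath.phase(received) / step) % m

M = 8
# Hypothetical extra phase seen by an eavesdropper (assumed: 90 degrees).
eve_rotation = cmath.exp(1j * math.pi / 2)

legit = [psk_detect(psk_symbol(k, M), M) for k in range(M)]
eve = [psk_detect(psk_symbol(k, M) * eve_rotation, M) for k in range(M)]
print(legit)  # phases match the intended symbols
print(eve)    # every symbol is shifted by two positions for 8-PSK
```

In this toy case the legitimate receiver recovers every index, while the rotated copy is decoded as a different symbol each time, which is the effect the abstract describes in terms of symbol error rate.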
Reconstructing non-point sources of diffusion fields using sensor measurements
John Murray-Bruce, P. Dragotti
We present a framework for estimating non-localized sources of diffusion fields using spatiotemporal measurements of the field. Specifically in this contribution, we consider two non-localized source types: straight line and polygonal sources and assume that the induced field is monitored using a sensor network. Given the sensor measurements, we demonstrate, for each non-point source parameterization, how to reduce the source estimation problem to a system governed by a power series expansion that can then be efficiently solved using Prony's method, in order to reconstruct the source. We then evaluate the proposed algorithms by performing some numerical simulations using both noiseless and noisy spatiotemporal sensor measurements of the field.
DOI: https://doi.org/10.1109/ICASSP.2016.7472429 · Published 2016-03-20
Citations: 4
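The abstract above relies on Prony's method to solve a system governed by a power series expansion. As a generic, minimal sketch of that method (not the paper's source reconstruction pipeline), the code below recovers the two bases u1 and u2 from four noiseless samples of s[n] = c1*u1**n + c2*u2**n. The two-term restriction, the function name, and the sample values are assumptions chosen to keep the example short.

```python
import cmath

def prony_two_terms(s):
    """Recover u1, u2 from s[n] = c1*u1**n + c2*u2**n using samples s[0..3]."""
    s0, s1, s2, s3 = s
    # Annihilating filter: s[n+2] + a1*s[n+1] + a2*s[n] = 0 for n = 0, 1.
    # Solve the 2x2 linear system for a1, a2 with Cramer's rule.
    det = s1 * s1 - s0 * s2
    if det == 0:
        raise ValueError("samples do not determine two distinct terms")
    a1 = (s0 * s3 - s1 * s2) / det
    a2 = (s2 * s2 - s1 * s3) / det
    # The roots of z**2 + a1*z + a2 = 0 are the unknown bases u1 and u2.
    root = cmath.sqrt(a1 * a1 - 4 * a2)
    return (-a1 - root) / 2, (-a1 + root) / 2

# Toy data (assumed): c1 = 1, u1 = 0.5, c2 = 3, u2 = 2.0.
samples = [1 * 0.5 ** n + 3 * 2.0 ** n for n in range(4)]
u_small, u_large = prony_two_terms(samples)
print(u_small, u_large)
```

Once the bases are known, the amplitudes follow from a linear least-squares fit; with noisy measurements, more samples and a more robust variant of the method would normally be used.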
Structural maximum a posteriori speaker adaptation of speaking rate-dependent hierarchical prosodic model for Mandarin TTS
I-Bin Liao, Chen-Yu Chiang, Sin-Horng Chen
In this paper, a structural maximum a posteriori speaker adaptation method is discussed that adjusts an existing speaking rate (SR)-dependent hierarchical prosodic model (SR-HPM) to a new speaker's data, so as to realize a new voice at any given SR. The adaptive SR-HPM is formulated based on MAP estimation, with a reference SR-HPM serving as an informative prior. The prior information provided by the reference SR-HPM is hierarchically organized by decision trees. The results of objective and subjective evaluations show that the proposed method not only performs slightly better than the maximum likelihood-based model in the observed SR range of the target speaker's data, but is also much better in the unseen SR range.
DOI: https://doi.org/10.1109/ICASSP.2016.7472754 · Published 2016-03-20
Citations: 4
Geodesic-based pavement shadow removal revisited
Qin Zou, Zhongwen Hu, Long Chen, Qian Wang, Qingquan Li
Shadows often cause uneven illumination in pavement images, which poses great challenges for image-based pavement crack detection. It is therefore desirable to remove pavement shadows before detecting pavement cracks. However, due to the large penumbras cast by trees, light poles, etc., it is difficult to locate shadows in a pavement image. In this paper, an automatic pavement shadow removal method based on geodesic analysis is proposed. First, a geodesic shadow model is used to partition a pavement shadow into a number of geodesic regions. Then, an optimal background region is selected as a reference by statistical analysis. Finally, a texture-balanced illuminance compensation is applied to all geodesic regions of the image. Experiments demonstrate the effectiveness of the proposed method.
DOI: https://doi.org/10.1109/ICASSP.2016.7471979 · Published 2016-03-20
Citations: 7
Simplified learning with binary orthogonal constraints
Qiang Huang
Deep-architecture-based Deep Belief Nets (DBNs) have shown their data modelling power by stacking several Restricted Boltzmann Machines (RBMs). However, the multiple-layer structure used in a DBN is computationally expensive and, furthermore, leads to slow convergence. This is because the pre-training stage is usually implemented in a data-driven way, and the class information attached to the training data is only used for fine-tuning. In this paper, we aim to simplify a multiple-layer DBN to a one-layer structure. We use class information as a constraint on the hidden layer during pre-training. For each training instance and its corresponding class, a binary sequence is generated in order to adapt the output of the hidden layer. We test our approach on four data sets: basic MNIST, basic negative MNIST, rotation MNIST and rectangle (tall vs. wide rectangles). The results show that the adapted one-layer structure can compete with a three-layer DBN.
DOI: https://doi.org/10.1109/ICASSP.2016.7472177 · Published 2016-03-20
Citations: 0