2015 International Joint Conference on Neural Networks (IJCNN): Latest Publications

Evaluation of optical flow field features for the detection of word prominence in a human-machine interaction scenario
Pub Date : 2015-07-12 DOI: 10.1109/IJCNN.2015.7280639
Andrea Schnall, M. Heckmann
In this paper we investigate optical flow field features for the automatic labeling of word prominence. Visual motion is a rich source of information. Modifying the articulatory parameters to raise the prominence of a segment of an utterance is usually accompanied by stronger movement of the mouth and head than in a non-prominent segment. One way to describe such motion is to use optical flow fields. During the recording of the audio-visual database used for the following experiments, the subjects were asked to correct the system's misunderstanding of a single word by using prosodic cues only, which created a narrow and a broad focus. Audio-visual recordings were made with a distant microphone and without visual markers. As acoustic features, duration, loudness, fundamental frequency and spectral emphasis were calculated. From the visual channel, the nose position is detected and the mouth region is extracted. From this region the optical flow is calculated, and all the optical flow fields for one word are summed up. The pooled optical flow for the four directions is then used as the feature vector. We demonstrate that using these features in addition to the audio features can improve the classification results for some speakers. We also compare the optical flow field features to other visual features: the nose position and image-transformation-based visual features. The optical flow field features carry less information than the image-transformation-based visual features, but using both in addition to the audio features leads to the overall best results, which shows that they contain complementary information.
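The pooling step described in this abstract can be sketched roughly as follows. This is only one plausible reading: the abstract does not say how the four directions are defined, so the left/right/up/down split, the image-coordinate convention, and the name `pool_flow_four_directions` are all assumptions made for illustration.

```python
def pool_flow_four_directions(flow_fields):
    """Sum the flow over all frames of one word and pool it into four bins.

    flow_fields: list of frames; each frame is a list of (dx, dy) flow vectors
    taken from the mouth region. Returns [left, right, up, down], each a
    non-negative total. Assumes image coordinates (negative dy = upward).
    """
    left = right = up = down = 0.0
    for frame in flow_fields:
        for dx, dy in frame:
            # Horizontal component goes to exactly one of left/right.
            if dx < 0:
                left += -dx
            else:
                right += dx
            # Vertical component goes to exactly one of up/down.
            if dy < 0:
                up += -dy
            else:
                down += dy
    return [left, right, up, down]


# Two tiny hypothetical frames with two flow vectors each.
word_flow = [
    [(1.0, 0.0), (-0.5, 2.0)],
    [(0.0, -1.5), (2.0, 0.5)],
]
features = pool_flow_four_directions(word_flow)
print(features)
```

The resulting four-element list plays the role of the visual feature vector for one word; in practice the flow vectors would come from an optical flow estimator rather than being typed in by hand.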
Citations: 2
Learning rule for associative memory in recurrent neural networks
Pub Date : 2015-07-12 DOI: 10.1109/IJCNN.2015.7280532
T. Jacob, W. Snyder
We present a new learning rule for intralayer connections in neural networks. The rule is based on Hebbian learning principles and is derived from information-theoretic considerations. A simple network trained using the rule is shown to have associative-memory-like properties. The network acts by building connections between correlated data points, under constraints.
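The abstract does not give the new rule itself, so the sketch below shows only the classical Hebbian outer-product rule that such work builds on, applied to a small Hopfield-style network. It is not the authors' rule; the function names and the two stored patterns are made up for illustration.

```python
def train_hebbian(patterns):
    """Classical Hebbian rule: strengthen w[i][j] when units i and j agree."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, state):
    """One synchronous update: each unit takes the sign of its weighted input."""
    out = []
    for row in w:
        h = sum(wij * s for wij, s in zip(row, state))
        out.append(1 if h >= 0 else -1)
    return out


# Two orthogonal +1/-1 patterns (hypothetical data).
stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
weights = train_hebbian(stored)

# The first pattern with its first element flipped.
noisy = [-1, 1, 1, 1, -1, -1, -1, -1]
recovered = recall(weights, noisy)
print(recovered == stored[0])
```

Connections form between units that are correlated across the stored data, which is what lets the network complete a corrupted input.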
Citations: 2
A multi-pheromone stigmergic distributed robot coordination strategy for fast surveillance task execution in unknown environments
Pub Date : 2015-07-12 DOI: 10.1109/IJCNN.2015.7280793
R. Calvo, A. A. Constantino, M. Figueiredo
A bio-inspired coordination strategy is improved with the aim of reducing the time that mobile multiagent systems need to accomplish surveillance tasks. The original strategy is based on a modified version of the basic ant system algorithm in which only repulsive pheromone is considered. The new version of the strategy also uses two other kinds of pheromone, so the agents can now mark strategic locations in order to reduce the path length. At the beginning, the agents try arbitrary trajectories to accomplish the surveillance task. After many trials, the agents prefer the paths that reduce the total length (time) needed to complete the task. Comparisons between the original and the extended coordination strategies are presented. Results show that, in the scenarios examined, the extended strategy achieves the best performance: the surveillance tasks are accomplished in less time than with the original strategy.
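As a rough illustration of the repulsive-pheromone idea behind the original strategy, the sketch below moves a single agent over a small graph: it marks the cell it occupies and then steps to the least-marked neighbour. The graph, the deposit and evaporation values, and the function name `step` are hypothetical; the two additional pheromone types of the extended strategy are not modelled.

```python
def step(position, neighbors, pheromone, deposit=1.0, evaporation=0.1):
    """Move one agent away from recently visited cells.

    The agent first lets all pheromone evaporate a little, marks its current
    cell, and then moves to the neighbouring cell with the least pheromone
    (ties are broken by the order in which neighbours are listed).
    """
    for cell in pheromone:
        pheromone[cell] *= 1.0 - evaporation
    pheromone[position] += deposit
    return min(neighbors[position], key=lambda cell: pheromone[cell])


# A hypothetical triangle of three mutually connected cells.
neighbors = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
pheromone = {"A": 0.0, "B": 0.0, "C": 0.0}

path = ["A"]
for _ in range(4):
    path.append(step(path[-1], neighbors, pheromone))
print(path)
```

Because marked cells repel the agent, it spreads its visits over the environment instead of revisiting the same cell, which is the behaviour a surveillance task needs.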
Citations: 4
A hierarchical SVM based multiclass classification by using similarity clustering
Pub Date : 2015-07-12 DOI: 10.1109/IJCNN.2015.7280489
Chao Dong, Bo Zhou, Jinglu Hu
This paper presents a new strategy for building a multi-tree hierarchical SVM structure that yields a more efficient and accurate classification model for multiclass problems. Based on the theory of the Binary Tree SVM (BTS), we propose an improved algorithm that extends the binary tree structure to a multi-tree structure. In the multi-tree hierarchical structure, a similarity clustering method is proposed to cluster classes into groups at each non-leaf node. To obtain a multi-node division, one-against-all (OAA) is applied to train those groups rather than individual classes. The proposed method avoids the data imbalance problem that occurs in OAA, and the classification area of a classifier in an upper layer is larger than that of a classifier in a lower layer. Experiments on many data sets demonstrate that, compared with several other well-known methods, our method reduces the number of classifiers used in the testing phase and achieves higher accuracy.
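The grouping of classes at a non-leaf node might look roughly like the sketch below. The abstract does not specify the similarity measure or the clustering procedure, so Euclidean distance between class centroids, the greedy grouping, the threshold, and all names here are assumptions; the SVM training on the resulting groups is omitted.

```python
import math


def class_centroids(samples):
    """samples maps a class label to a list of feature vectors."""
    centroids = {}
    for label, vectors in samples.items():
        dim = len(vectors[0])
        centroids[label] = [
            sum(v[d] for v in vectors) / len(vectors) for d in range(dim)
        ]
    return centroids


def group_similar_classes(samples, threshold):
    """Greedily put a class into the first group holding a nearby centroid."""
    centroids = class_centroids(samples)
    groups = []
    for label in sorted(centroids):
        for group in groups:
            if any(
                math.dist(centroids[label], centroids[other]) <= threshold
                for other in group
            ):
                group.append(label)
                break
        else:
            groups.append([label])  # no similar group found: start a new one
    return groups


# Hypothetical two-dimensional training data for four classes.
samples = {
    "cat": [(0.0, 0.0), (0.0, 2.0)],
    "dog": [(1.0, 1.0), (1.0, 1.0)],
    "car": [(10.0, 10.0), (12.0, 10.0)],
    "bus": [(11.0, 11.0), (11.0, 13.0)],
}
groups = group_similar_classes(samples, threshold=2.5)
print(groups)
```

Each group would then be treated as one node of the tree, with a one-against-all classifier separating the groups and the procedure repeated inside every group that still contains more than one class.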
Citations: 8
Strategic approach for Multiple-MLP Ensemble Re-RX algorithm
Pub Date : 2015-07-12 DOI: 10.1109/IJCNN.2015.7280387
Y. Hayashi, Shota Fujisawa
In this paper, we review all our work since 2012 and propose a strategic approach for the Multiple-MLP Ensemble Re-RX algorithm. We first describe the background and procedures of the Recursive-Rule Extraction (Re-RX) algorithm family and its variants, including the Multiple-MLP Ensemble Re-RX algorithm (“Multiple-MLP Ensemble”), which uses the Re-RX algorithm as its core. The proposed strategic approach consists of two processes: non-pruning for the trained neural network ensembles without continuous attributes and a relaxed rule generation scheme using continuous attributes to extract extremely accurate, comprehensible, and concise rules for multi-class mixed datasets (i.e., discrete attributes and continuous attributes). We conducted experiments to find rules for seven kinds of multi-class mixed datasets and compared the accuracy, comprehensibility, and conciseness for the Multiple-MLP Ensemble Re-RX algorithm. The strategic approach for the Multiple-MLP Ensemble Re-RX algorithm outperformed the original Multiple-MLP Ensemble Re-RX algorithm. These results confirm that the strategic approach for the Multiple-MLP Ensemble algorithm facilitates the migration from existing data systems toward new accurate analytic systems and Big Data.
Citations: 4
A method for finding similarity between multi-layer perceptrons by Forward Bipartite Alignment
Pub Date : 2015-07-12 DOI: 10.1109/IJCNN.2015.7280769
Stephen C. Ashmore, Michael S. Gashler
We present Forward Bipartite Alignment (FBA), a method that aligns the topological structures of two neural networks. Neural networks are considered to be black boxes because they have a complex model surface determined by weights that combine attributes non-linearly. Two networks that make similar predictions on training data may still generalize differently. FBA enables a diversity of applications, including visualization and canonicalization of neural networks, ensembles, and cross-over between unrelated neural networks in evolutionary optimization. We describe the FBA algorithm and implementations for three applications: genetic algorithms, visualization, and ensembles. We demonstrate FBA's usefulness by comparing a bag of neural networks to a bag of FBA-aligned neural networks. We also show that aligning and then combining two neural networks causes no appreciable loss in accuracy, which means that Forward Bipartite Alignment aligns neural networks in a meaningful way.
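The general idea of aligning two networks by matching their hidden units can be sketched as below. This is a simplified illustration rather than the FBA algorithm: the abstract does not give the matching criterion, so squared distance between incoming weight vectors is assumed, the assignment is found by brute force over permutations (only practical for very small layers), and all names and weights are hypothetical.

```python
from itertools import permutations


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def align_hidden_units(w_in_a, w_in_b):
    """Return perm where hidden unit perm[i] of B is matched to unit i of A.

    w_in_x[h] is the incoming weight vector of hidden unit h in network x.
    """
    n = len(w_in_a)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(sq_dist(w_in_a[i], w_in_b[perm[i]]) for i in range(n)),
    )
    return list(best)


def permute_network(w_in, w_out, perm):
    """Reorder hidden units; w_out[o][h] is the weight from unit h to output o.

    Incoming and outgoing weights move together, so the network computes the
    same function after the reordering.
    """
    new_in = [w_in[p] for p in perm]
    new_out = [[row[p] for p in perm] for row in w_out]
    return new_in, new_out


# Two hypothetical networks with 2 inputs, 3 hidden units and 1 output.
w_in_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_in_b = [[0.1, 0.9], [1.1, 0.9], [0.9, 0.1]]
w_out_b = [[5.0, 6.0, 7.0]]

perm = align_hidden_units(w_in_a, w_in_b)
aligned_in_b, aligned_out_b = permute_network(w_in_b, w_out_b, perm)
print(perm)
```

Once the hidden units of the second network are in the same order as those of the first, operations such as weight averaging or cross-over combine corresponding units instead of arbitrary ones.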
Citations: 12
Integration of articulatory knowledge and voicing features based on DNN/HMM for Mandarin speech recognition
Pub Date : 2015-07-12 DOI: 10.1109/IJCNN.2015.7280396
Ying-Wei Tan, Wenju Liu, Wei Jiang, Hao Zheng
Speech production knowledge has been used successfully to enhance the phonetic representation and the performance of automatic speech recognition (ASR) systems. Representations of speech production offer simple explanations for many phenomena observed in speech. These phenomena cannot be easily analyzed from either the acoustic signal or the phonetic transcription alone. One of the most important aspects of speech production knowledge is the use of articulatory knowledge, which describes the smooth and continuous movements in the vocal tract. In this paper, we present a new articulatory model that provides information for rescoring the speech recognition lattice hypotheses. The articulatory model consists of a feature front-end, which computes a voicing feature based on a spectral harmonics correlation (SHC) function, and a back-end based on the combination of deep neural networks (DNNs) and hidden Markov models (HMMs). The voicing features are combined with standard Mel frequency cepstral coefficients (MFCCs) using heteroscedastic linear discriminant analysis (HLDA) to improve speech recognition accuracy. Moreover, the algorithm exploits the advantages of both models: it retains the deep learning properties of DNNs while modeling the articulatory context through HMMs. Mandarin speech recognition experiments show that the proposed method achieves significant improvements in speech recognition performance over the system using MFCCs alone.
Citations: 3
Coarse-to-fine trained multi-scale Convolutional Neural Networks for image classification
Pub Date : 2015-07-12 DOI: 10.1109/IJCNN.2015.7280542
Haobin Dou, Xihong Wu
Convolutional Neural Networks (CNNs) have become powerful models for feature learning and image classification. They achieve translation invariance through spatial convolution and pooling mechanisms, but their scale invariance is limited. To tackle the problem of scale variation in image classification, this work proposes a multi-scale CNN model with a depth-decreasing multi-column structure. Input images are decomposed into multiple scales, and at each scale a CNN column is instantiated, with its depth decreasing from the fine to the coarse scale to simplify the model. Scale-invariant features are learned by weights shared across all scales and pooled among adjacent scales. In particular, a coarse-to-fine pre-training method that imitates the human development of spatial frequency perception is proposed to train this multi-scale CNN, which accelerates the training process and reduces the classification error. In addition, a model averaging technique is used to combine the models obtained during pre-training and further improve performance. With these methods, our model achieved classification errors of 15.38% on the CIFAR-10 dataset and 41.29% on the CIFAR-100 dataset, i.e., reductions of 1.05% and 2.97% compared with the single-scale CNN model.
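The decomposition of an input image into multiple scales could be sketched as follows. The abstract does not say how the scales are produced, so repeated 2x2 average pooling is an assumption here, as are the function names and the tiny 4x4 example image; the CNN columns themselves are not shown.

```python
def downsample2x(image):
    """Halve the resolution by averaging each non-overlapping 2x2 block."""
    height, width = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, width - 1, 2)
        ]
        for r in range(0, height - 1, 2)
    ]


def build_pyramid(image, num_scales):
    """Return [fine, ..., coarse]; each level has half the previous resolution."""
    scales = [image]
    for _ in range(num_scales - 1):
        scales.append(downsample2x(scales[-1]))
    return scales


# A hypothetical 4x4 grayscale image.
image = [
    [1.0, 1.0, 3.0, 3.0],
    [1.0, 1.0, 3.0, 3.0],
    [5.0, 5.0, 7.0, 7.0],
    [5.0, 5.0, 7.0, 7.0],
]
pyramid = build_pyramid(image, num_scales=3)
for level in pyramid:
    print(len(level), "x", len(level[0]))
```

In a coarse-to-fine schedule of the kind the abstract describes, training would start on the last (coarsest) level of such a pyramid and add the finer levels afterwards.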
Citations: 12
Improving load forecast accuracy by clustering consumers using smart meter data
Pub Date : 2015-07-12 DOI: 10.1109/IJCNN.2015.7280393
Abbas Shahzadeh, A. Khosravi, S. Nahavandi
Utility companies provide electricity to a large number of consumers. These companies need an accurate forecast of the next-day electricity demand, because any forecast error results in either reliability issues or increased costs for the company. Owing to the widespread roll-out of smart meters, a large amount of high-resolution consumption data that was not available in the past is now accessible. This new data can be used to improve the load forecast and, as a result, increase the reliability and decrease the expenses of electricity providers. In this paper, a number of methods for improving load forecasts using smart meter data are discussed. In these methods, consumers are first divided into a number of clusters. Then a neural network is trained for each cluster, and the forecasts of these networks are added together to form the prediction for the aggregated load. It is demonstrated that clustering increases the forecast accuracy significantly. The criteria used for grouping consumers play an important role in this process. In this work, three different feature selection methods for clustering consumers are explained and the effect of feature extraction methods on forecast error is investigated.
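The cluster-then-sum structure described in this abstract can be sketched as below. The cluster assignment is taken as given, and the per-cluster neural network is replaced by a trivial "repeat the last value" placeholder so that the example stays self-contained; the consumer names, load values, and function names are hypothetical.

```python
def cluster_loads(consumer_loads, assignment):
    """Sum the load series of all consumers that belong to the same cluster."""
    clusters = {}
    for consumer, series in consumer_loads.items():
        label = assignment[consumer]
        if label not in clusters:
            clusters[label] = [0.0] * len(series)
        clusters[label] = [a + b for a, b in zip(clusters[label], series)]
    return clusters


def naive_forecast(series):
    """Placeholder for the per-cluster neural network: repeat the last value."""
    return series[-1]


def aggregated_forecast(consumer_loads, assignment, forecaster=naive_forecast):
    """Forecast each cluster separately and add the cluster forecasts."""
    clusters = cluster_loads(consumer_loads, assignment)
    return sum(forecaster(series) for series in clusters.values())


# Hypothetical smart meter readings for three households.
consumer_loads = {
    "h1": [1.0, 2.0, 3.0],
    "h2": [2.0, 2.0, 2.0],
    "h3": [0.5, 0.5, 1.0],
}
assignment = {"h1": "evening", "h2": "flat", "h3": "evening"}

total = aggregated_forecast(consumer_loads, assignment)
print(total)
```

Passing a different `forecaster` is where a trained model per cluster would plug in; the aggregation step stays the same.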
Citations: 57
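The cluster-then-forecast scheme in the load forecasting abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the clustering features (mean and peak load per consumer) are hypothetical, plain k-means stands in for whichever clustering method the paper uses, and a least-squares AR(1) model stands in for the per-cluster neural network. The function names `kmeans`, `fit_ar1`, and `forecast_total` are made up for this example.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each non-empty center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def fit_ar1(series):
    """Least-squares fit of y[t] = a * y[t-1] + b.

    Assumption: this simple model replaces the per-cluster neural network.
    """
    x, y = series[:-1], series[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    var = sum((xi - mx) ** 2 for xi in x)
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / var if var else 0.0
    return a, my - a * mx


def forecast_total(loads, k):
    """loads: one daily-load history (list of floats) per consumer."""
    # Hypothetical clustering features: mean and peak of each history.
    features = [[sum(h) / len(h), max(h)] for h in loads]
    labels = kmeans(features, k)
    total = 0.0
    for c in range(k):
        members = [h for h, lab in zip(loads, labels) if lab == c]
        if not members:
            continue  # k-means can leave a cluster empty
        cluster_load = [sum(day) for day in zip(*members)]  # aggregate per day
        a, b = fit_ar1(cluster_load)
        total += a * cluster_load[-1] + b  # next-day forecast for this cluster
    return total
```

Calling `forecast_total(histories, k=2)` on a list of per-consumer histories returns the sum of the per-cluster next-day forecasts, which is the aggregation step the abstract describes. Swapping `fit_ar1` for a trained network would leave the surrounding structure unchanged.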
Analysis and evaluation of smartphone-based human activity recognition using a neural network approach
Pub Date : 2015-07-12 DOI: 10.1109/IJCNN.2015.7280494
Yongjin Kwon, K. Kang, C. Bae
Measuring daily physical activity has become increasingly important for several purposes. A number of methods exist for measuring physical activity, such as self-reporting and attaching wearable sensors. Because smartphones have rapidly become widespread, physical activity can now be measured easily with the accelerometers built into them. Although a number of studies have exploited smartphone acceleration data for activity recognition, there has been little discussion of how each accelerometer axis influences recognition. In this paper, we investigate how each axis of smartphone acceleration data affects the performance of human activity recognition using a neural-network-based classifier. Assuming that the smartphone is kept in a pants pocket, the acceleration data of a subject are collected during standing, walking, and running for ten minutes. A multilayer perceptron is used as the activity classifier to recognize the three activities. Using averages as features, the classifier with the x-axis features provides the best accuracy. Using standard deviations as features, however, yields better accuracies than using averages.
Citations: 12
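The per-axis features named in the activity recognition abstract above (window averages and standard deviations) can be sketched as follows. This is a toy illustration under stated assumptions: the window size, the synthetic sine-wave signals, and all function names are hypothetical, and a nearest-centroid rule stands in for the multilayer perceptron used in the paper. It does not reproduce the paper's data or results.

```python
import math
from statistics import mean, pstdev


def split_windows(samples, size):
    """Cut a list of (x, y, z) samples into non-overlapping windows."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def window_feature(window, axis, stat):
    """One scalar feature: the mean or standard deviation of a single axis."""
    values = [sample[axis] for sample in window]
    return mean(values) if stat == "mean" else pstdev(values)


def fit_centroids(features, labels):
    """Average feature value per activity (stand-in for training the MLP)."""
    grouped = {}
    for feature, label in zip(features, labels):
        grouped.setdefault(label, []).append(feature)
    return {label: mean(values) for label, values in grouped.items()}


def predict(centroids, feature):
    """Pick the activity whose centroid is closest to the feature."""
    return min(centroids, key=lambda label: abs(centroids[label] - feature))


def synth(amplitude, n=200):
    """Synthetic y-axis oscillation around gravity; purely illustrative."""
    return [(0.0, 9.8 + amplitude * math.sin(2 * math.pi * i / 20), 0.0)
            for i in range(n)]


# Hypothetical training data: larger oscillation for more vigorous activity.
training = {"standing": synth(0.0), "walking": synth(2.0), "running": synth(6.0)}
features, labels = [], []
for activity, samples in training.items():
    for window in split_windows(samples, 40):
        features.append(window_feature(window, axis=1, stat="std"))
        labels.append(activity)
centroids = fit_centroids(features, labels)
```

Changing `axis` or switching `stat` to `"mean"` gives the other feature variants the abstract compares, so the same scaffold can be used to check how each axis and statistic behaves on real recordings.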