{"title":"Encoding in Neural Networks","authors":"S. Bharitkar","doi":"10.1109/ICMLA.2019.00065","DOIUrl":null,"url":null,"abstract":"Data transforms, parameter re-normalization, and activation functions have gained significant attention in the neural network community over the past several years for improving convergence speed. The results in the literature are for computer vision applications, with batch-normalization (BN) and the Rectified Linear Unit (ReLU) activation attracting attention. In this paper, we present a new approach in data transformation in the context of regression during the synthesis of Head-related Transfer Functions (HRTFs) in the field of audio. The encoding technique whitens the real-valued input data delivered to the first hidden layer of a fully-connected neural network (FCNN) thereby providing the training speedup. The experimental results demonstrate, in a statistically significant way, that the presented data encoding approach outperforms other forms of normalization in terms of convergence speed, lower mean-square error, and robustness to network parameter initialization. Towards this, we used some popular first-and second-order gradient techniques such as scaled conjugate gradient, Extreme Learning Machine (ELM), and stochastic gradient descent with momentum and batch normalization. The improvements, as shown through t-SNE based depiction and analysis on the input covariance matrix, confirm the reduction in the condition number of the input covariance matrix (a process similar to whitening).","PeriodicalId":436714,"journal":{"name":"2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLA.2019.00065","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
Data transforms, parameter re-normalization, and activation functions have gained significant attention in the neural network community over the past several years for improving convergence speed. Most results in the literature target computer vision applications, with batch normalization (BN) and the Rectified Linear Unit (ReLU) activation attracting particular attention. In this paper, we present a new approach to data transformation in the context of regression, applied to the synthesis of Head-Related Transfer Functions (HRTFs) in the field of audio. The encoding technique whitens the real-valued input data delivered to the first hidden layer of a fully-connected neural network (FCNN), thereby speeding up training. The experimental results demonstrate, in a statistically significant way, that the presented data-encoding approach outperforms other forms of normalization in terms of convergence speed, mean-square error, and robustness to network parameter initialization. To this end, we used popular first- and second-order gradient techniques such as scaled conjugate gradient, the Extreme Learning Machine (ELM), and stochastic gradient descent with momentum and batch normalization. The improvements, as shown through t-SNE-based visualization and analysis of the input covariance matrix, confirm the reduction in the condition number of the input covariance matrix (a process similar to whitening).
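The paper itself includes no code, and its encoding is only described as a process similar to whitening. As a rough illustration of that core idea, the sketch below applies standard PCA whitening to correlated real-valued inputs and compares the condition number of the input covariance matrix before and after; the `whiten` function and all numerical details are assumptions for illustration, not the authors' method.

```python
import numpy as np

def whiten(X, eps=1e-8):
    """PCA-whiten the rows of X (n_samples x n_features).

    Illustrative sketch only: the paper describes its encoding as
    a process *similar* to whitening, not necessarily this one.
    """
    Xc = X - X.mean(axis=0)                   # zero-center each feature
    cov = Xc.T @ Xc / (Xc.shape[0] - 1)       # input covariance matrix
    eigval, eigvec = np.linalg.eigh(cov)      # symmetric eigendecomposition
    W = eigvec / np.sqrt(eigval + eps)        # rescale each principal axis
    return Xc @ W                             # covariance of result ~ identity

rng = np.random.default_rng(0)
# Strongly correlated synthetic inputs (stand-in for the HRTF features).
X = rng.normal(size=(1000, 32)) @ rng.normal(size=(32, 32))

for name, data in [("raw", X), ("whitened", whiten(X))]:
    c = data - data.mean(axis=0)
    cond = np.linalg.cond(c.T @ c / (c.shape[0] - 1))
    print(f"{name:9s} covariance condition number: {cond:.2e}")
```

The raw covariance matrix is badly conditioned because the features are correlated; after whitening, its condition number drops to roughly 1, which is the property the abstract links to faster, more robust convergence of the first-layer weights.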