Hardware Implementation of k-Winner-Take-All Neural Network with On-chip Learning
Hui-Ya Li, C. Ou, Yi-Tsan Hung, Wen-Jyi Hwang, Chia-Lung Hung
2010 13th IEEE International Conference on Computational Science and Engineering, December 2010. DOI: 10.1109/CSE.2010.51
Citations: 5
Abstract
This paper presents a novel pipelined architecture for the competitive learning (CL) algorithm with k-winners-take-all activation. The architecture employs a codeword swapping scheme so that neurons that fail the competition for a training vector become immediately available for the competitions for subsequent training vectors. An efficient pipeline architecture based on the codeword swapping scheme is then designed to enhance throughput. The CPU time of the NIOS processor executing CL training with the proposed architecture as a hardware accelerator is measured. Experimental results show that this CPU time is lower than that of other hardware or software implementations running the CL training program, with or without the support of custom hardware.
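To give a concrete picture of the training rule the hardware accelerates, the following is a minimal software sketch of competitive learning with k-winners-take-all activation. It is not the paper's pipelined architecture and does not model the codeword swapping scheme; the function name, learning rate, and random data are illustrative assumptions.

```python
import numpy as np

def kwta_cl_step(codebook, x, k=2, lr=0.05):
    """One competitive-learning update with k-winners-take-all activation.

    codebook : (N, D) array of codewords (neuron weight vectors), updated in place
    x        : (D,) training vector
    k        : number of winning neurons updated per training vector
    lr       : learning rate (illustrative value, not from the paper)
    """
    # Squared Euclidean distance from the training vector to every codeword.
    dists = np.sum((codebook - x) ** 2, axis=1)
    # k-winners-take-all: the k closest codewords win the competition.
    winners = np.argsort(dists)[:k]
    # Move only the winning codewords toward the training vector.
    codebook[winners] += lr * (x - codebook[winners])
    return winners

# Usage sketch: train a small codebook on random data.
rng = np.random.default_rng(0)
codebook = rng.random((8, 4))       # 8 neurons, 4-dimensional codewords
for x in rng.random((100, 4)):      # 100 training vectors
    kwta_cl_step(codebook, x, k=2)
```

Each training vector updates only its k nearest codewords; the paper's contribution is pipelining this competition in hardware so that neurons losing one competition can immediately take part in the competitions for subsequent training vectors.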