Author: Marcin Skubiszewski
Published in: [1992] Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing, 1992-12-01
DOI: 10.1109/SPDP.1992.242756 (https://doi.org/10.1109/SPDP.1992.242756)
An exact hardware implementation of the Boltzmann machine
The author presents a faithful hardware implementation of the Boltzmann machine, built on top of DECPeRLe-1, a reconfigurable coprocessor closely coupled with its host machine, a DECstation 500. The prototype performs 505 megasynapses (millions of additions and multiplications) per second, using 16-bit fixed-point weights. It can emulate fully connected instances of the Boltzmann machine containing up to 1438 variables. The specialized hardware executes only the simplest part of the Boltzmann machine algorithm, namely multiplying matrices of numbers by vectors of bits. The other operations, which are complicated but require only a modest amount of computation, are performed by the host processor. The key to this work lies in making the right design choices. The most important of these are the rejection of 'neural parallelism', which makes the implementation exact, and the algorithm used to generate random numbers in software, which allows the hardware to be simple. The fact that DECPeRLe-1 makes hardware development cheap and fast was essential to this work.
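The division of labour described in the abstract can be illustrated with a small software sketch. It is not the author's design. It assumes that "rejecting neural parallelism" means updating units one at a time, that the weight matrix has a zero diagonal, and that one full sweep costs n² synapse operations. The names `FRACTION_BITS`, `local_fields`, `sequential_sweep`, and `sweeps_per_second` are hypothetical, as is the 8-bit fractional scaling of the 16-bit weights.

```python
import math
import random

# Hypothetical fixed-point scaling: stored weight = real weight * 2**8.
FRACTION_BITS = 8


def local_fields(weights, state):
    """Multiply an integer weight matrix by a vector of bits.

    Each state[j] is 0 or 1, so every product weights[i][j] * state[j]
    reduces to either adding weights[i][j] or skipping it.
    """
    return [sum(w for w, s in zip(row, state) if s) for row in weights]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sequential_sweep(weights, state, temperature, rng):
    """Update every unit once, one unit at a time, in place.

    Each unit sees the already-updated values of the units before it,
    which is what distinguishes this from a simultaneous update.
    Assumes a zero diagonal, so a unit does not feed back on itself.
    """
    scale = float(1 << FRACTION_BITS)
    for i, row in enumerate(weights):
        # The matrix-by-bit-vector part: conditional additions only.
        field = sum(w for w, s in zip(row, state) if s) / scale
        # The "host" part: probability and random draw in software.
        p_on = sigmoid(field / temperature)
        state[i] = 1 if rng.random() < p_on else 0
    return state


def sweeps_per_second(synapses_per_second, n_units):
    """Full sweeps per second if one sweep costs n_units**2 synapse operations."""
    return synapses_per_second / (n_units ** 2)


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [[0, 256, -256], [256, 0, 512], [-256, 512, 0]]
    print(local_fields(weights, [1, 0, 1]))
    print(sequential_sweep(weights, [1, 0, 1], temperature=1.0, rng=rng))
    print(sweeps_per_second(505e6, 1438))
```

Under the n² cost assumption, the quoted figures of 505 megasynapses per second and 1438 variables would correspond to roughly 244 full sweeps per second on the largest supported network.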