Benchmarking of the CM-5 and the Cray machines with a very large backpropagation neural network

Xiao Liu, G. Wilcox
{"title":"用一个非常大的反向传播神经网络对CM-5和Cray机器进行基准测试","authors":"Xiao Liu, G. Wilcox","doi":"10.1109/ICNN.1994.374132","DOIUrl":null,"url":null,"abstract":"In this paper, we present a new, efficient implementation of the backpropagation algorithm (BP) on the CM-5 by fully taking advantage of its Control Network to avoid explicit message-passing. The nodes in the input and output layers are evenly distributed to all processors: all nodes in the hidden layer(s) are replicated in each processor, and all weights are distributed to all processors corresponding to the nodes. We have implemented this algorithm on the CM-5 in the MIMD mode using the C programming language. For a case study of protein tertiary structure prediction, we obtained performance of 76 million weight updates per second (WUPS) with the machine partitioned for 512 processors without vector units. Experiments using different sized partitions indicated an almost linear relationship between the computation time and the number of processors, indicating good parallelization. We have also implemented the backpropagation algorithm on the Cray machines using the C programming language. The Cray-2 implementation yields performance of 10 million WUPS; the Cray X-MP EA implementation yields 18 million WUPS; and the Cray Y-MP M92 implementation yields 40 million WUPS.<<ETX>>","PeriodicalId":209128,"journal":{"name":"Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1994-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":"{\"title\":\"Benchmarking of the CM-5 and the Cray machines with a very large backpropagation neural network\",\"authors\":\"Xiao Liu, G. Wilcox\",\"doi\":\"10.1109/ICNN.1994.374132\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we present a new, efficient implementation of the backpropagation algorithm (BP) on the CM-5 by fully taking advantage of its Control Network to avoid explicit message-passing. The nodes in the input and output layers are evenly distributed to all processors: all nodes in the hidden layer(s) are replicated in each processor, and all weights are distributed to all processors corresponding to the nodes. We have implemented this algorithm on the CM-5 in the MIMD mode using the C programming language. For a case study of protein tertiary structure prediction, we obtained performance of 76 million weight updates per second (WUPS) with the machine partitioned for 512 processors without vector units. Experiments using different sized partitions indicated an almost linear relationship between the computation time and the number of processors, indicating good parallelization. We have also implemented the backpropagation algorithm on the Cray machines using the C programming language. 
The Cray-2 implementation yields performance of 10 million WUPS; the Cray X-MP EA implementation yields 18 million WUPS; and the Cray Y-MP M92 implementation yields 40 million WUPS.<<ETX>>\",\"PeriodicalId\":209128,\"journal\":{\"name\":\"Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94)\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1994-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"18\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICNN.1994.374132\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICNN.1994.374132","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 18

Abstract

In this paper, we present a new, efficient implementation of the backpropagation algorithm (BP) on the CM-5 by fully taking advantage of its Control Network to avoid explicit message-passing. The nodes in the input and output layers are evenly distributed to all processors: all nodes in the hidden layer(s) are replicated in each processor, and all weights are distributed to all processors corresponding to the nodes. We have implemented this algorithm on the CM-5 in the MIMD mode using the C programming language. For a case study of protein tertiary structure prediction, we obtained performance of 76 million weight updates per second (WUPS) with the machine partitioned for 512 processors without vector units. Experiments using different sized partitions indicated an almost linear relationship between the computation time and the number of processors, indicating good parallelization. We have also implemented the backpropagation algorithm on the Cray machines using the C programming language. The Cray-2 implementation yields performance of 10 million WUPS; the Cray X-MP EA implementation yields 18 million WUPS; and the Cray Y-MP M92 implementation yields 40 million WUPS.
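The partitioning described in the abstract (input and output units split evenly across processors, hidden units replicated on every processor, and each weight stored with the processor that owns its input or output unit) can be sketched in C as below. This is a minimal single-process illustration written against that description, not the authors' CM-5 code; the layer sizes, NPROCS, and global_sum are hypothetical, and global_sum merely stands in for the global reduction the CM-5 Control Network would perform across processors without explicit message-passing.

```c
/*
 * Sketch of one processor's share of a backpropagation forward pass,
 * following the partitioning described in the abstract:
 *   - input and output units are split evenly across NPROCS processors,
 *   - hidden units are replicated on every processor,
 *   - each input-to-hidden weight lives on the processor owning its input unit.
 * Sizes and helper names are assumptions for illustration only.
 */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define N_IN   128   /* input units (assumed size)   */
#define N_HID  64    /* hidden units (assumed size)  */
#define NPROCS 4     /* pretend partition size       */

/* Stand-in for the Control Network reduction: on a real CM-5 this would
 * combine the partial sums from all processors; here there is only one
 * "processor", so it is a no-op. */
static void global_sum(double *v, int n) { (void)v; (void)n; }

int main(void)
{
    int my_share = N_IN / NPROCS;        /* input units owned by this processor */
    double x[N_IN / NPROCS];             /* local slice of the input layer      */
    double w[N_IN / NPROCS][N_HID];      /* weights owned by this processor     */
    double hid[N_HID] = {0.0};           /* replicated hidden-layer net inputs  */

    /* toy values in place of a real training pattern and trained weights */
    for (int i = 0; i < my_share; i++) {
        x[i] = (double)rand() / RAND_MAX;
        for (int j = 0; j < N_HID; j++)
            w[i][j] = 0.01 * ((double)rand() / RAND_MAX - 0.5);
    }

    /* each processor accumulates its partial net input to every hidden unit */
    for (int j = 0; j < N_HID; j++)
        for (int i = 0; i < my_share; i++)
            hid[j] += w[i][j] * x[i];

    /* one global reduction gives every processor the complete hidden-layer
     * net inputs, so the replicated hidden activations stay consistent */
    global_sum(hid, N_HID);

    for (int j = 0; j < N_HID; j++)
        hid[j] = 1.0 / (1.0 + exp(-hid[j]));   /* replicated sigmoid outputs */

    printf("hidden[0] = %f\n", hid[0]);
    return 0;
}
```

With this layout each of the P processors touches roughly 1/P of the weights per training pattern, which is consistent with the roughly linear scaling of computation time with processor count reported in the abstract. The WUPS figures are, by definition of the metric, the total number of weight updates performed divided by the wall-clock training time.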