Spin-lock synchronization on the Butterfly and KSR1
Xiaodong Zhang, R. Castañeda, E. Chan
IEEE Parallel & Distributed Technology: Systems & Applications, published 1994-03-01
DOI: 10.1109/88.281875 (https://doi.org/10.1109/88.281875)
Citations: 22
Abstract
The drawbacks of the simple spin-lock limit its effective use to small critical sections. Applications with large critical sections and a large number of processors require more efficient algorithms to minimize processor and network overheads. Variations on the spin-lock have been tested on the Sequent Symmetry, a bus-based shared-memory multiprocessor. Algorithms for scalable synchronization have also been tested on the BBN Butterfly I, a large-scale shared-memory multiprocessor with a multistage interconnection network (MIN). We have extended the investigation to the BBN GP1000 and TC2000, both MIN-based multiprocessors with heavier network contention than the Butterfly I. We have also implemented the algorithms on Kendall Square Research's KSR1, a hierarchical-ring (HR) multiprocessor system, to study the effects of cache coherence. The execution behavior of spin-lock algorithms differs significantly between MIN-based and HR-based architectures. Our tests suggest that HR-based architectures handle network and memory contention more efficiently than MIN-based architectures. However, our results also suggest how spin-locks can be made cost-effective on both.
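
For readers unfamiliar with the term, the following is a minimal sketch of a spin-lock in C11, not the authors' code. The abstract does not say which variations were tested, so this shows one well-known variation as an assumption: test-and-test-and-set with exponential backoff. Waiters spin on plain reads and only attempt the atomic write once the lock looks free, which is the usual way to cut the memory and network traffic that a simple spin-lock generates. All names (`spinlock_t`, `spin_lock`, `spin_trylock`, `spin_unlock`, `SPINLOCK_INIT`) and the backoff limits are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type and function names; not taken from the paper. */
typedef struct {
    atomic_bool held;   /* true while some thread owns the lock */
} spinlock_t;

#define SPINLOCK_INIT { false }

/* One atomic test-and-set attempt; returns true if the lock was acquired. */
bool spin_trylock(spinlock_t *l) {
    return !atomic_exchange_explicit(&l->held, true, memory_order_acquire);
}

/* Test-and-test-and-set with exponential backoff (limits are arbitrary). */
void spin_lock(spinlock_t *l) {
    unsigned delay = 1;
    const unsigned max_delay = 1024;
    for (;;) {
        /* Wait on plain loads so that spinning issues no writes. */
        while (atomic_load_explicit(&l->held, memory_order_relaxed)) {
        }
        if (spin_trylock(l)) {
            return;
        }
        /* Lost the race to another waiter: pause, then double the pause. */
        for (volatile unsigned i = 0; i < delay; i++) {
        }
        if (delay < max_delay) {
            delay *= 2;
        }
    }
}

/* Release the lock so that one waiter's next test-and-set can succeed. */
void spin_unlock(spinlock_t *l) {
    atomic_store_explicit(&l->held, false, memory_order_release);
}
```

A critical section would be wrapped as `spin_lock(&l); /* ... */ spin_unlock(&l);`. How well the read-spinning and backoff pay off depends on the interconnect and on whether the machine keeps caches coherent, which is the kind of architectural difference the abstract describes.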