An empirical study of the scalability aspects of instruction distribution algorithms for clustered processors
Aneesh Aggarwal, M. Franklin
2001 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
DOI: 10.1109/ISPASS.2001.990696 (https://doi.org/10.1109/ISPASS.2001.990696)
Citations: 36
Abstract
In the sub-micron technology era, wire delays are becoming much more significant than gate delays, which makes decentralized (clustered) processors particularly attractive. A number of algorithms have already been proposed for distributing instructions among multiple clusters. In this paper, we qualitatively and quantitatively analyze the effect of various hardware parameters on the scalability of different instruction distribution algorithms. Using a set of realistic system parameters, we examine the performance differences that result from the choice of distribution algorithm as well as from specific implementation issues such as the type of interconnect, the fetch size, the cluster issue width, and the cluster window size. Our studies show that the distribution algorithms that perform relatively better with 4 or fewer clusters are generally not the best ones for a larger number of clusters. The relative performance and scalability of the algorithms are also sensitive to the hardware parameters. Moreover, among the existing algorithms, no single algorithm works uniformly best across all hardware configurations. This motivates the development of alternative interconnects and instruction distribution algorithms.