Title: Exploitation of operation-level parallelism in a processor of the CRAY X-MP
Authors: S. Vajapeyam, G. Sohi, W. Hsu
Published in: Proceedings., 1990 IEEE International Conference on Computer Design: VLSI in Computers and Processors
Publication date: 1990-09-17
DOI: 10.1109/ICCD.1990.130149
URL: https://doi.org/10.1109/ICCD.1990.130149
Citations: 2
Abstract
Available operation-level parallelism and its exploitation in the CRAY X-MP processor are studied. Considered are the sizes and contributions to execution time of basic blocks, instruction and operation issue rates and issue stalls, and operation execution overlap for entire executions of three large programs, FLO52, TRFD, and QCD1, taken from the Perfect Club benchmark set. The large basic blocks account for a significant portion of the overall execution time. It is also found that with the use of vector instructions, the X-MP is able to issue more than one operation per clock cycle, even though it can issue a maximum of one instruction per cycle.
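The distinction between instruction issue rate and operation issue rate can be illustrated with a minimal sketch. The trace below is entirely hypothetical and is not data from the paper; the `Issued` record and the `issue_rates` helper are names introduced here for illustration only. It assumes each scalar instruction counts as one operation and each vector instruction counts as one operation per element, with a vector length of 64.

```python
from dataclasses import dataclass


@dataclass
class Issued:
    """One issued instruction in a hypothetical trace (illustrative only)."""
    cycle: int       # clock cycle in which the instruction issued
    operations: int  # 1 for a scalar instruction, vector length for a vector one


def issue_rates(trace, total_cycles):
    """Return (instructions per cycle, operations per cycle) for the trace."""
    instructions = len(trace)
    operations = sum(entry.operations for entry in trace)
    return instructions / total_cycles, operations / total_cycles


# Hypothetical 100-cycle window: 20 scalar instructions followed by
# 2 vector instructions, each assumed to operate on 64 elements.
# At most one instruction issues in any given cycle.
trace = [Issued(cycle=c, operations=1) for c in range(20)]
trace += [Issued(cycle=20, operations=64), Issued(cycle=21, operations=64)]

instr_rate, op_rate = issue_rates(trace, total_cycles=100)
print(f"instructions per cycle: {instr_rate:.2f}")
print(f"operations per cycle:   {op_rate:.2f}")
```

In this made-up window the instruction issue rate stays below one per cycle, while the operation issue rate exceeds one because each vector instruction contributes many operations.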