R. Ubal, J. Sahuquillo, S. Petit, P. López, J. Duato
2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT), published 2010-09-11. DOI: 10.1145/1854273.1854349 (https://doi.org/10.1145/1854273.1854349)
Exploiting subtrace-level parallelism in clustered processors
The performance evaluation has been carried out on top of the Multi2Sim 2.2 simulation framework [2], a cycle-accurate simulator for x86-based superscalar processors, extended to model a clustered architecture with support for independent subtrace generation. The parameters of the modeled machine are summarized in Table 1. The Mediabench suite has been used to stress the machine, and each simulation is stopped after the first 100 million uops commit. The steering algorithm and the interconnection network among clusters are important design factors related to the criticality of the inter-cluster communication latency. To obtain good baseline performance, the modeled schemes use a sophisticated steering algorithm called topology-aware steering [3], and several interconnection networks with different realistic link delays are considered.
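To make the link between steering, topology, and inter-cluster latency concrete, the following is a minimal sketch and not the topology-aware steering algorithm of [3] or any Multi2Sim code. It assumes a bidirectional ring of clusters, models the communication latency as the hop count times a fixed per-link delay, and steers an instruction to the cluster where its latest-arriving operand arrives soonest. All function names, the ring topology, and the cost function are hypothetical choices made for illustration.

```python
from __future__ import annotations

# Hypothetical illustration: the ring topology, the names, and the cost
# function are assumptions, not taken from the paper or from Multi2Sim.


def ring_hops(src: int, dst: int, num_clusters: int) -> int:
    """Minimum number of links between two clusters on a bidirectional ring."""
    d = abs(src - dst) % num_clusters
    return min(d, num_clusters - d)


def comm_latency(src: int, dst: int, num_clusters: int, link_delay: int) -> int:
    """Cycles needed to forward a value from src to dst (hops * link delay)."""
    return ring_hops(src, dst, num_clusters) * link_delay


def steer(operand_clusters: list[int], num_clusters: int, link_delay: int) -> int:
    """Pick the cluster that minimizes the worst operand forwarding latency.

    Simplified distance-only heuristic: it ignores load balance and
    contention. Ties are resolved in favor of the lowest-numbered cluster.
    """

    def cost(candidate: int) -> int:
        return max(
            (
                comm_latency(producer, candidate, num_clusters, link_delay)
                for producer in operand_clusters
            ),
            default=0,  # no register operands: every cluster costs the same
        )

    return min(range(num_clusters), key=cost)
```

Under these assumptions, with four clusters and a one-cycle link delay, an instruction whose operands are produced in clusters 0 and 2 is steered to cluster 1, one hop from both producers. Increasing `link_delay` scales every forwarding cost, which is why the choice of steering policy matters more on slower interconnects.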