{"title":"MMMP中根据资源需求调度线程的性能研究","authors":"L. Weng, Chen Liu","doi":"10.1109/ICPPW.2010.53","DOIUrl":null,"url":null,"abstract":"The Multi-core Multi-threading Microprocessor introduces not only resource sharing to threads in the same core, e.g., computation resources and private caches, but also isolates those resources within different cores. Moreover, when the Simultaneous Multithreading architecture is employed, the execution resources are fully shared among the concurrently executing threads in the same core, while the isolation is worsened as the number of cores increases. Even though fetch policies regarding how to assign priorities in fetch stage are well designed to manage the shared resources in a core, it is actually the scheduling policy that makes the distributed resources available for workloads, through deciding how to send their threads to cores. On the other hand, threads consume various resources in different phases and Cycles Per Instruction Spent on Memory (CPImem) is used to express their resource demands. Consequently, aiming at better performance via scheduling according to their resource demands, we propose the Mix-Scheduling to evenly mix threads across cores, so that it achieves thread diversity, i.e., CPImem diversity in every core. As a result, it is observed in our experiment that 63% improvement in overall system throughput and 27% improvement in average thread performance, when comparing the Mix-Scheduling policy with the reference policy Mono-Scheduling, which keeps CPImem uniformity among threads in every core on chips. Furthermore, the Mix-Scheduling also makes an essential step towards shortening load latency, because it succeeds in reducing the L2 Cache Miss Rate by 6% from Mono-Scheduling.","PeriodicalId":415472,"journal":{"name":"2010 39th International Conference on Parallel Processing Workshops","volume":"106 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":"{\"title\":\"On Better Performance from Scheduling Threads According to Resource Demands in MMMP\",\"authors\":\"L. Weng, Chen Liu\",\"doi\":\"10.1109/ICPPW.2010.53\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The Multi-core Multi-threading Microprocessor introduces not only resource sharing to threads in the same core, e.g., computation resources and private caches, but also isolates those resources within different cores. Moreover, when the Simultaneous Multithreading architecture is employed, the execution resources are fully shared among the concurrently executing threads in the same core, while the isolation is worsened as the number of cores increases. Even though fetch policies regarding how to assign priorities in fetch stage are well designed to manage the shared resources in a core, it is actually the scheduling policy that makes the distributed resources available for workloads, through deciding how to send their threads to cores. On the other hand, threads consume various resources in different phases and Cycles Per Instruction Spent on Memory (CPImem) is used to express their resource demands. Consequently, aiming at better performance via scheduling according to their resource demands, we propose the Mix-Scheduling to evenly mix threads across cores, so that it achieves thread diversity, i.e., CPImem diversity in every core. As a result, it is observed in our experiment that 63% improvement in overall system throughput and 27% improvement in average thread performance, when comparing the Mix-Scheduling policy with the reference policy Mono-Scheduling, which keeps CPImem uniformity among threads in every core on chips. Furthermore, the Mix-Scheduling also makes an essential step towards shortening load latency, because it succeeds in reducing the L2 Cache Miss Rate by 6% from Mono-Scheduling.\",\"PeriodicalId\":415472,\"journal\":{\"name\":\"2010 39th International Conference on Parallel Processing Workshops\",\"volume\":\"106 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-09-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"11\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2010 39th International Conference on Parallel Processing Workshops\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICPPW.2010.53\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 39th International Conference on Parallel Processing Workshops","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICPPW.2010.53","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
On Better Performance from Scheduling Threads According to Resource Demands in MMMP
The Multi-core Multi-threading Microprocessor introduces not only resource sharing to threads in the same core, e.g., computation resources and private caches, but also isolates those resources within different cores. Moreover, when the Simultaneous Multithreading architecture is employed, the execution resources are fully shared among the concurrently executing threads in the same core, while the isolation is worsened as the number of cores increases. Even though fetch policies regarding how to assign priorities in fetch stage are well designed to manage the shared resources in a core, it is actually the scheduling policy that makes the distributed resources available for workloads, through deciding how to send their threads to cores. On the other hand, threads consume various resources in different phases and Cycles Per Instruction Spent on Memory (CPImem) is used to express their resource demands. Consequently, aiming at better performance via scheduling according to their resource demands, we propose the Mix-Scheduling to evenly mix threads across cores, so that it achieves thread diversity, i.e., CPImem diversity in every core. As a result, it is observed in our experiment that 63% improvement in overall system throughput and 27% improvement in average thread performance, when comparing the Mix-Scheduling policy with the reference policy Mono-Scheduling, which keeps CPImem uniformity among threads in every core on chips. Furthermore, the Mix-Scheduling also makes an essential step towards shortening load latency, because it succeeds in reducing the L2 Cache Miss Rate by 6% from Mono-Scheduling.