Y. Ben-Asher, Irina Lipov, V. Tartakovsky, Dror Tiv
{"title":"使用多操作指令生成具有优化管道结构的api","authors":"Y. Ben-Asher, Irina Lipov, V. Tartakovsky, Dror Tiv","doi":"10.1109/FCCM.2014.16","DOIUrl":null,"url":null,"abstract":"We propose automatic synthesis of application specific instruction set processors (ASIPs). We use pipeline execution of multi-op machine-instructions, e.g., *(reg1*reg2) = (*reg3) + (*reg4) (C-syntax) an instruction with three memory stages and two arithmetic stages pipeline. The problem is, for a given set of loops, to find a pipeline configuration and a multi-op ISA that maximizes the IPC (instructions per cycle) while minimizing the resource usage and the cost of interconnections to the register-file of the resulting CPU. The algorithm is based on finding an efficient cover of a large graph by a small set of convex sub-graphs gis that are consistent with a given structure of a pipeline. Unlike previous works, gis are not synthesized to circuits that are executed in a co-processor mode but rather both gis and the rest of the program are executed by the same set of multiop pipeline units. In this way we eliminate the overhead associated with the co-processor mode of regular ASIPs but maintain high values of IPC of these ASIPs. The main advantage of using pipeline execution of multi-op versus VLIW instructions is shown to be the cost of interconnections between the CPU's execution units and the register file. Thus, we devise a grading function that for each possible multi-op pipeline configuration balance between the expected IPC (Instructions Per Cycle) and the complexity of the interconnections. Using this grading function we show that in most cases the VLIW configuration is not always the best choice.","PeriodicalId":246162,"journal":{"name":"2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom Computing Machines","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Using Multi-op Instructions as a Way to Generate ASIPs with Optimized Pipeline Structure\",\"authors\":\"Y. Ben-Asher, Irina Lipov, V. Tartakovsky, Dror Tiv\",\"doi\":\"10.1109/FCCM.2014.16\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We propose automatic synthesis of application specific instruction set processors (ASIPs). We use pipeline execution of multi-op machine-instructions, e.g., *(reg1*reg2) = (*reg3) + (*reg4) (C-syntax) an instruction with three memory stages and two arithmetic stages pipeline. The problem is, for a given set of loops, to find a pipeline configuration and a multi-op ISA that maximizes the IPC (instructions per cycle) while minimizing the resource usage and the cost of interconnections to the register-file of the resulting CPU. The algorithm is based on finding an efficient cover of a large graph by a small set of convex sub-graphs gis that are consistent with a given structure of a pipeline. Unlike previous works, gis are not synthesized to circuits that are executed in a co-processor mode but rather both gis and the rest of the program are executed by the same set of multiop pipeline units. In this way we eliminate the overhead associated with the co-processor mode of regular ASIPs but maintain high values of IPC of these ASIPs. The main advantage of using pipeline execution of multi-op versus VLIW instructions is shown to be the cost of interconnections between the CPU's execution units and the register file. Thus, we devise a grading function that for each possible multi-op pipeline configuration balance between the expected IPC (Instructions Per Cycle) and the complexity of the interconnections. Using this grading function we show that in most cases the VLIW configuration is not always the best choice.\",\"PeriodicalId\":246162,\"journal\":{\"name\":\"2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom Computing Machines\",\"volume\":\"2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom Computing Machines\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/FCCM.2014.16\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom Computing Machines","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/FCCM.2014.16","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Using Multi-op Instructions as a Way to Generate ASIPs with Optimized Pipeline Structure
We propose automatic synthesis of application specific instruction set processors (ASIPs). We use pipeline execution of multi-op machine-instructions, e.g., *(reg1*reg2) = (*reg3) + (*reg4) (C-syntax) an instruction with three memory stages and two arithmetic stages pipeline. The problem is, for a given set of loops, to find a pipeline configuration and a multi-op ISA that maximizes the IPC (instructions per cycle) while minimizing the resource usage and the cost of interconnections to the register-file of the resulting CPU. The algorithm is based on finding an efficient cover of a large graph by a small set of convex sub-graphs gis that are consistent with a given structure of a pipeline. Unlike previous works, gis are not synthesized to circuits that are executed in a co-processor mode but rather both gis and the rest of the program are executed by the same set of multiop pipeline units. In this way we eliminate the overhead associated with the co-processor mode of regular ASIPs but maintain high values of IPC of these ASIPs. The main advantage of using pipeline execution of multi-op versus VLIW instructions is shown to be the cost of interconnections between the CPU's execution units and the register file. Thus, we devise a grading function that for each possible multi-op pipeline configuration balance between the expected IPC (Instructions Per Cycle) and the complexity of the interconnections. Using this grading function we show that in most cases the VLIW configuration is not always the best choice.