K. Shimamura, Shigeya Tanaka, Tetsuya Shimomura, T. Hotta, E. Kamada, H. Sawamoto, Teruhisa Shimizu, K. Nakazawa
A superscalar RISC processor with pseudo vector processing feature
Proceedings of ICCD '95 International Conference on Computer Design. VLSI in Computers and Processors, 1995-10-02
DOI: 10.1109/ICCD.1995.528797 (https://doi.org/10.1109/ICCD.1995.528797)
Citations: 10
Abstract
A novel architectural extension, in which floating-point data are transferred directly from main memory to floating-point registers, has been successfully implemented in a superscalar RISC processor. This extension allows a main memory access throughput of 1.2 Gbyte/s, and effective performance reaches 267 MFLOPS (89% of peak performance) for typical floating-point applications. The processor is fabricated in a 0.3-micron, 4-level-metal CMOS technology with a 2.5 V power supply and contains 3.9 million transistors on a 15.7 mm × 15.7 mm die. Only 4.5% of the die area is used for the extension. Pipeline stage optimization and a scoreboard-based dependency check method allow the extension to be realized without affecting the operating frequency.
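The figures quoted in the abstract imply a few quantities that it does not state directly. The short Python sketch below derives them as a rough sanity check. The variable names are illustrative, the derived values are back-of-the-envelope estimates rather than numbers reported by the authors, and the bytes-per-FLOP ratio assumes that 1 Gbyte means 10^9 bytes.

```python
# Figures quoted in the abstract
effective_mflops = 267.0           # effective performance, MFLOPS
fraction_of_peak = 0.89            # effective performance as a share of peak
die_side_mm = 15.7                 # die is 15.7 mm x 15.7 mm
extension_area_fraction = 0.045    # share of die area used by the extension
memory_throughput_gbyte_s = 1.2    # main memory access throughput, Gbyte/s

# Derived quantities (estimates only; not stated in the abstract)
implied_peak_mflops = effective_mflops / fraction_of_peak
die_area_mm2 = die_side_mm ** 2
extension_area_mm2 = die_area_mm2 * extension_area_fraction

# Assumption: 1 Gbyte = 1e9 bytes
bytes_per_flop = (memory_throughput_gbyte_s * 1e9) / (effective_mflops * 1e6)

print(f"Implied peak performance: {implied_peak_mflops:.0f} MFLOPS")
print(f"Die area: {die_area_mm2:.2f} mm^2")
print(f"Extension area: {extension_area_mm2:.2f} mm^2")
print(f"Memory bytes per floating-point operation: {bytes_per_flop:.2f}")
```

Under these assumptions the implied peak is roughly 300 MFLOPS and the extension occupies about 11 mm² of the die. Because the 89% and 4.5% figures are rounded in the abstract, both results are approximate.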