Pub Date: 2001-01-29 DOI: 10.1109/ACAC.2001.903358
J. Gough
A popular trend in current software technology is to gain program portability by compiling programs to an intermediate form based on an abstract machine definition. Such approaches date back at least to the 1970s, but have gained new impetus from the current popularity of the Java programming language. Implementations of Java compile programs to bytecodes understood by the Java Virtual Machine (JVM). More recently, Microsoft have released preliminary details of their ".NET" platform, which is based on an abstract machine superficially similar to the JVM. In each case, program execution is normally mediated by a just-in-time (JIT) compiler, although in principle interpretative execution is also possible. Although these two competing technologies share some common aims, the objectives of the virtual machine designs are significantly different. In particular, the ease with which embedded systems might use small-footprint versions of these virtual machines depends on detailed properties of the machine definitions. In this study, a compiler was implemented that can produce output code that may be run on either the JVM or the .NET platform. The compiler is available in the public domain, and allows comparisons to be made both at compile time and at runtime.
Title: Stacking them up: a comparison of virtual machines. Published in: Proceedings 6th Australasian Computer Systems Architecture Conference. ACSAC 2001.
Pub Date: 2001-01-29 DOI: 10.1109/ACAC.2001.903361
Shu-Lin Hwang, F. Lai
Modern micro-architectures employ superscalar techniques to enhance system performance. Superscalar microprocessors must fetch at least one instruction cache line at a time to support a high issue rate and a large amount of speculative execution. In this paper, we propose Grouped Branch Prediction (GBP), which can recognize and predict multiple branches in the same instruction cache line for a wide-issue micro-architecture. Several configurations of the GBP with different group sizes are simulated. The simulation results show that the branch penalty for group size 4 with 2048 entries is under 0.65 clock cycles. In our design, we choose the two-group scheme with group size 4. This scheme achieves an average of 4.9 IPC_f (the number of instructions fetched per cycle for a machine front-end). Furthermore, we extend the GBP to predict two cache lines with two fetch units. The 2048-entry 2-group scheme with group size 4 can produce an average of 8.4 IPC_f. The performance is approximately 66.5% better than the original 2-group GBPs. The added hardware cost (41.5 kbits) is less than 40%.
Title: Two cache lines prediction for a wide-issue micro-architecture. Published in: Proceedings 6th Australasian Computer Systems Architecture Conference. ACSAC 2001.
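The idea of predicting several branches in one instruction cache line, as described in the abstract above, can be illustrated with a minimal sketch. It is not the authors' design: the class name, the indexing by line address modulo the table size, the 2-bit saturating counters and the one-counter-per-branch-slot layout are all assumptions made for illustration. Only the figures 2048 entries and group size 4 come from the abstract.

```python
class GroupedBranchPredictor:
    """Toy predictor: one group of 2-bit counters per table entry (hypothetical layout)."""

    def __init__(self, entries=2048, group_size=4):
        self.entries = entries
        self.group_size = group_size
        # Each counter starts at 1 (weakly not-taken); values range over 0..3.
        self.table = [[1] * group_size for _ in range(entries)]

    def _group(self, line_addr):
        # Assumed indexing: cache-line address modulo the number of entries.
        return self.table[line_addr % self.entries]

    def predict_line(self, line_addr):
        """Predict taken/not-taken for every branch slot in the line at once."""
        return [counter >= 2 for counter in self._group(line_addr)]

    def first_taken_slot(self, line_addr):
        """Slot of the first branch predicted taken, or None if the line falls through."""
        for slot, taken in enumerate(self.predict_line(line_addr)):
            if taken:
                return slot
        return None

    def update(self, line_addr, slot, taken):
        """Train one counter of the group with the resolved branch outcome."""
        group = self._group(line_addr)
        if taken:
            group[slot] = min(3, group[slot] + 1)
        else:
            group[slot] = max(0, group[slot] - 1)
```

In this sketch a fetch unit would call first_taken_slot to decide where the useful instructions in the line end and which target to fetch next; how the real GBP selects targets is not stated in the abstract.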
Pub Date: 2001-01-29 DOI: 10.1109/ACAC.2001.903355
E. Fardin, P. Munro, Jarred Scagliotta, John Morris
Since parallel processors are generally constrained by the available interprocessor data transfer capability, system designers generally try to push interconnection systems to their limits in bandwidth. Practical and economic systems are constrained by many physical and packaging considerations, such as a need to use commercially available connectors. We describe here VisiSolve, a simulator that we have built to predict the behaviour of interconnect systems that can readily be assembled from 'off-the-shelf' components. It uses a finite element approach and predicts the dynamic electric field in the cells of the mesh. The irregular geometries of the individual parts of such components require us to adapt the mesh used in simulations in regions where the needs of a practical connector (small size, low insertion force and automatic assembly) have dictated the shape and path of the conductors. We have adopted a method which uses the constitutive error (the discrepancy between electric fields calculated directly and from $\nabla \times H$ when $H$ was calculated directly) as an indicator that refinement is needed.
Title: A simulator for high speed digital communications. Published in: Proceedings 6th Australasian Computer Systems Architecture Conference. ACSAC 2001.
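The refinement indicator mentioned in the abstract above compares two estimates of the electric field in each mesh cell. The sketch below shows only that comparison step, under simplifying assumptions: each cell is reduced to a single scalar field magnitude, the two estimates are supplied as plain lists, and the function name, the relative-error measure and the tolerance value are hypothetical. How VisiSolve actually derives the field from the curl of H is not described in the abstract and is not modelled here.

```python
def cells_to_refine(e_direct, e_from_curl_h, tolerance=0.1):
    """Return indices of cells whose two field estimates disagree by more than `tolerance`.

    e_direct:       per-cell field magnitude calculated directly (assumed input)
    e_from_curl_h:  per-cell field magnitude derived from curl H (assumed input)
    tolerance:      hypothetical relative-discrepancy threshold
    """
    flagged = []
    for index, (direct, derived) in enumerate(zip(e_direct, e_from_curl_h)):
        # Normalise by the larger magnitude; the floor avoids division by zero.
        scale = max(abs(direct), abs(derived), 1e-12)
        if abs(direct - derived) / scale > tolerance:
            flagged.append(index)
    return flagged
```

A mesh adaptation loop would subdivide the flagged cells and re-run the solver until no cell exceeds the tolerance; that loop is omitted because the abstract gives no detail on it.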
Pub Date: 2001-01-29 DOI: 10.1109/ACAC.2001.903375
G. Wigley, D. Kearney
Traditional reconfigurable computing platforms are designed to be single-user and are acknowledged to be difficult to design applications for. The design tools are still primitive, and as reconfigurable computing becomes mainstream the development of new design tools and run-time environments is essential. As the number of system gates is reaching 10 million on current FPGAs, there is an increase in demand to share a single FPGA amongst multiple applications. A third party must be introduced to handle the sharing of the FPGA, and we therefore introduce the first real single-FPGA concurrent multi-user operating system for reconfigurable computers. In this paper we describe the complete operating system for reconfigurable architectures and the implementation details of the first limited multi-user operating system. The first OS is a loader: it allocates FPGA area and can dynamically partition, place and route applications at run-time. As operating systems for reconfigurable computing are a new area of research, we also had to develop techniques for regression testing and performance comparison. This involved the development of a test suite.
Title: The first real operating system for reconfigurable computers. Published in: Proceedings 6th Australasian Computer Systems Architecture Conference. ACSAC 2001.
Pub Date: 2001-01-15 DOI: 10.1109/ACAC.2001.903351
A. Edwards, G. Heiser
Component-based programming systems have shown themselves to be a natural way of constructing extensible software. Well-defined interfaces, encapsulation, late binding and polymorphism promote extensibility, yet despite this synergy, components have not been widely employed at the systems level. This is primarily due to the failure of existing component technologies to provide the protection and performance required of systems software. In this paper we identify the requirements for a component system to support secure extensions, and describe the design of such a system on the Mungi OS.
Title: Components + security = OS extensibility. Published in: Proceedings 6th Australasian Computer Systems Architecture Conference. ACSAC 2001.
Pub Date: 2001-01-15 DOI: 10.1109/ACAC.2001.903365
Heui ran Lee, P. Becket, B. Appelbe
In this paper, a new architecture called the extendable instruction set computer (EISC) is introduced that addresses the issues of memory size and performance in embedded microprocessor systems. The architecture has an efficient fixed-length 16-bit instruction set with short offset and immediate operands. The offset and immediate operands can be extended to 32 bits via the operation of an extension flag. The code density of the EISC instruction set and its memory transfer performance are shown to be significantly higher than those of current architectures, making it a suitable candidate for the next generation of embedded computer systems. The compact EISC instruction set introduces data dependencies that seemingly limit deep pipeline and superscalar implementations. This paper suggests a mechanism by which these dependencies might be removed in hardware.
Title: High-performance extendable instruction set computing. Published in: Proceedings 6th Australasian Computer Systems Architecture Conference. ACSAC 2001.
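The extension-flag mechanism described in the abstract above can be made concrete with a toy decoder. The encoding used here is invented for illustration and is not taken from the paper: a 16-bit word whose top two bits are 11 is treated as a hypothetical prefix carrying 14 payload bits, and every other word is treated as an ordinary instruction with a 4-bit immediate in its low bits. Two prefixes followed by one instruction then supply 14 + 14 + 4 = 32 bits.

```python
MASK32 = 0xFFFFFFFF
PREFIX_TAG = 0b11  # hypothetical: top two bits mark an extension prefix


def encode_constant(value, opcode_bits=0x1000):
    """Build a three-word sequence (two prefixes + one instruction) for a 32-bit value."""
    high = (value >> 18) & 0x3FFF
    middle = (value >> 4) & 0x3FFF
    low = value & 0xF
    return [0xC000 | high, 0xC000 | middle, opcode_bits | low]


def decode_immediates(words):
    """Return the effective immediate of each ordinary (non-prefix) instruction."""
    extension_register = 0
    extension_flag = False
    immediates = []
    for word in words:
        if (word >> 14) == PREFIX_TAG:
            payload = word & 0x3FFF
            if extension_flag:
                # A second prefix appends its payload below the bits already held.
                extension_register = ((extension_register << 14) | payload) & MASK32
            else:
                extension_register = payload
            extension_flag = True
        else:
            short_immediate = word & 0xF
            if extension_flag:
                immediate = ((extension_register << 4) | short_immediate) & MASK32
            else:
                immediate = short_immediate  # zero-extended in this toy model
            immediates.append(immediate)
            # The flag is consumed by the instruction that uses it.
            extension_register, extension_flag = 0, False
    return immediates
```

Because each instruction reads the flag and register left by the preceding words, the decoder has to walk the stream in order. That serial dependency is the kind of data dependency the abstract says seemingly limits deep pipeline and superscalar implementations.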
Pub Date: 2001-01-15 DOI: 10.1109/ACAC.2001.903349
D. Brodrick, Anwar S. Dawood, N. Bergmann, Melanie Wark
The Australian FedSat satellite will incorporate a payload to validate the use of adaptive computing architectures in spacecraft applications. The technology has many exciting benefits for deployment in spacecraft, but the space environment also presents unique challenges which must be addressed. An important consideration is that modern SRAM Field Programmable Gate Arrays (FPGAs), such as the Xilinx 4000 device used on FedSat, are vulnerable to a range of radiation-induced errors. A system is required to detect and mitigate these effects. General strategies have been described in the literature, but this work is believed to be the first deployment of a complete space-ready FPGA error control system. A primary aim of the system is to quantify the range of effects that occur, so emphasis is placed on classifying a wide range of errors. Different strategies have distinct capabilities, so the final system employs a blend of detection techniques.
Title: Error detection for adaptive computing architectures in spacecraft applications. Published in: Proceedings 6th Australasian Computer Systems Architecture Conference. ACSAC 2001.