Pub Date: 1997-07-14 DOI: 10.1109/ASAP.1997.606839
B. Kienhuis, E. Deprettere, K. Vissers, P. V. D. Wolf
In this paper we present an approach for the quantitative analysis of application-specific dataflow architectures. The approach lets the designer rate design alternatives quantitatively and thereby supports the search for better-performing architectures. The context of our work is video signal processing algorithms that are mapped onto weakly programmable, coarse-grain dataflow architectures. The algorithms are represented as Kahn graphs whose nodes implement coarse-grain functions. We have implemented an architecture simulation environment that permits dataflow architectures to be defined as a composition of architecture elements, such as functional units, buffer elements and communication structures. The abstract, clock-cycle-accurate simulator has been built using a multi-threading package and employs object-oriented principles, which makes it both configurable and efficient. Algorithms can subsequently be executed on the architecture model, producing quantitative information for selected performance metrics. Results are presented for the simulation of a realistic application on several dataflow architecture alternatives, showing that many different architectures can be simulated in modest time on a modern workstation.
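As a rough illustration of the Kahn-graph representation and the multi-threaded simulation style mentioned in the abstract, the following minimal sketch runs three coarse-grain nodes as threads connected by unbounded FIFO channels with blocking reads. It is not the authors' simulator and is not clock-cycle accurate; the node functions (`source`, `scale`, `sink`), the `run_network` helper and the `None` end-of-stream marker are all assumptions made for this example.

```python
import queue
import threading

END = None  # end-of-stream marker (an assumption of this sketch)


def source(out_q, samples):
    """Node that emits each sample as a token, then the end marker."""
    for s in samples:
        out_q.put(s)
    out_q.put(END)


def scale(in_q, out_q, gain):
    """Node that multiplies every incoming token by a constant gain."""
    while True:
        token = in_q.get()  # blocking read, as in a Kahn process network
        if token is END:
            out_q.put(END)
            return
        out_q.put(gain * token)


def sink(in_q, results):
    """Node that collects tokens until the end marker arrives."""
    while True:
        token = in_q.get()
        if token is END:
            return
        results.append(token)


def run_network(samples, gain=2):
    """Wire source -> scale -> sink with unbounded FIFOs and run to completion."""
    a, b = queue.Queue(), queue.Queue()  # unbounded FIFO channels
    results = []
    threads = [
        threading.Thread(target=source, args=(a, samples)),
        threading.Thread(target=scale, args=(a, b, gain)),
        threading.Thread(target=sink, args=(b, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because every node only performs blocking reads on its input channel, the output is independent of how the threads are scheduled, which is the property that makes Kahn graphs convenient for this kind of simulation.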
Title: An Approach for Quantitative Analysis of Application-Specific Dataflow Architectures
Venue (as listed in the record): 2015 IEEE 26th International Conference on Application-specific Systems, Architectures and Processors (ASAP), pp. 338-349. DOI URL: https://doi.org/10.1109/ASAP.1997.606839
Pub Date: 1993-10-25 DOI: 10.1109/ASAP.1993.397121
L. Lucke, K. Parhi
Achieving additional speedup in rank order and stack filter architectures requires parallel processing techniques such as pipelining and block processing. Pipelining is well understood, but few block architectures have been developed for rank order and stack filtering. Block processing is essential when the architecture reaches the throughput limits imposed by the underlying technology. A trivial block structure replicates a single-input, single-output structure to form a multiple-input, multiple-output structure and can achieve speedups equal to the block size (that is, the number of outputs per block). Unlike linear filters, rank order and stack filters compute their outputs using comparisons, and these comparisons can be shared within the block structure. The authors introduce a systematic method for applying block processing to rank order and stack filters. The method exploits shared comparisons within the block structure to generate a block filter with shared substructures and reduced complexity. Block processing is also important for generating low-power designs: trivial block structures yield low-power designs only up to a certain limit, and the authors demonstrate how block structures with shared substructures are used to generate designs with arbitrarily low power.
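The idea of sharing comparisons between the outputs of one block can be sketched as follows. This is only a small illustration, not the systematic method the abstract refers to: it computes a rank order filter by counting pairwise comparisons and, when `share` is set, reuses any comparison already made for an earlier output in the same block. The function names (`rank_select`, `block_filter`), the tie-breaking rule and the convention that each window ends at its output index are assumptions of this example.

```python
from itertools import combinations


def rank_select(window_idx, x, rank, compare):
    """Return the sample of the given rank (0 = smallest) in the window.

    Each sample's rank is the number of pairwise comparisons it "wins";
    ties are broken by position, so the ranks are always distinct.
    """
    wins = {i: 0 for i in window_idx}
    for i, j in combinations(window_idx, 2):  # always i < j
        if compare(i, j):  # x[i] <= x[j], so x[j] ranks above x[i]
            wins[j] += 1
        else:
            wins[i] += 1
    for i in window_idx:
        if wins[i] == rank:
            return x[i]


def block_filter(x, start, window, block, rank, share):
    """Compute `block` consecutive outputs; output k uses the window
    ending at index start + k. Returns (outputs, comparisons performed)."""
    cache = {}
    n_compares = 0

    def compare(i, j):
        nonlocal n_compares
        if share and (i, j) in cache:
            return cache[(i, j)]  # reuse a comparison shared within the block
        n_compares += 1
        cache[(i, j)] = x[i] <= x[j]
        return cache[(i, j)]

    outputs = []
    for k in range(block):
        end = start + k
        idx = list(range(end - window + 1, end + 1))
        outputs.append(rank_select(idx, x, rank, compare))
    return outputs, n_compares
```

For a 3-sample median filter with block size 2, the two windows overlap in two samples, so the trivial block structure performs 6 comparisons while the shared version performs 5:

```python
x = [5, 1, 4, 2, 3]
print(block_filter(x, start=2, window=3, block=2, rank=1, share=False))  # ([4, 2], 6)
print(block_filter(x, start=2, window=3, block=2, rank=1, share=True))   # ([4, 2], 5)
```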
Title: Parallel processing architectures for rank order and stack filters
Venue (as listed in the record): 2015 IEEE 26th International Conference on Application-specific Systems, Architectures and Processors (ASAP), pp. 65-76. DOI URL: https://doi.org/10.1109/ASAP.1993.397121
Pub Date: 1990-01-01 DOI: 10.1109/ASAP.1990.145449
Chi-Min Liu, C. Jen
Title: Recursive algorithms for AR spectral estimation and their array realizations
Venue (as listed in the record): 2015 IEEE 26th International Conference on Application-specific Systems, Architectures and Processors (ASAP), pp. 121-132. DOI URL: https://doi.org/10.1109/ASAP.1990.145449