Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579918
Haiqian Yu, M. Leeser
Most image processing applications are both computationally intensive and data intensive. Reconfigurable hardware boards provide a convenient and flexible way to speed up these algorithms. To obtain a high-performance design without repeating the time-consuming hardware design process for each algorithm, we present a simple design flow for window-based image processing applications. Three upper bounds are derived from the area, memory bandwidth and on-chip memory constraints, and these determine the block structure of a design that fully utilizes the resources available on the board. A new buffering method is also discussed that builds an efficient memory hierarchy for this type of application.
Optimizing data intensive window-based image processing on reconfigurable hardware boards. Haiqian Yu, M. Leeser. IEEE Workshop on Signal Processing Systems Design and Implementation, 2005. https://doi.org/10.1109/SIPS.2005.1579918
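As a rough illustration of the bounding step in the window-based design flow above, the sketch below takes the number of parallel processing blocks to be the smallest of the three upper bounds. The `Board` fields, the per-block costs and the use of a plain minimum are assumptions made for illustration; the abstract does not give the actual formulas.

```python
from dataclasses import dataclass


@dataclass
class Board:
    """Hypothetical resource summary of a reconfigurable board."""
    logic_units: int          # available logic resources (arbitrary unit)
    mem_words_per_cycle: int  # off-chip words deliverable per clock cycle
    onchip_words: int         # on-chip buffer capacity in words


def max_parallel_blocks(board, block_area, words_per_block_cycle, buffer_words_per_block):
    """Largest block count allowed by all three constraints (assumed model)."""
    area_bound = board.logic_units // block_area
    bandwidth_bound = board.mem_words_per_cycle // words_per_block_cycle
    memory_bound = board.onchip_words // buffer_words_per_block
    return min(area_bound, bandwidth_bound, memory_bound)


board = Board(logic_units=10000, mem_words_per_cycle=8, onchip_words=4096)
# Bounds are 8 (area), 4 (bandwidth) and 8 (on-chip memory).
print(max_parallel_blocks(board, 1200, 2, 512))  # prints 4
```

In this made-up configuration the memory bandwidth is the binding constraint, so adding logic would not help until the buffering scheme reduces off-chip traffic.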
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579832
S. Pollin, R. Goyens, W. Cleeren, B. Bougard
Future wireless communication devices are expected to support a wide range of applications while coping with stringent energy budgets. Delivering the required performance at each moment with minimal energy consumption is a promising energy management technique for enabling pervasive wireless networking. Considering transmission energy only, using multiple small hops decreases the energy consumption. On the other hand, decreasing the transmission rate of a single hop similarly decreases the energy needed to deliver a bit. In this paper we compare the use of multiple small hops along different paths with a single large hop in the energy-throughput design space. In contrast with earlier work, realistic transceiver models are used that cover the complete MAC, transmit and receive chain and support different transmission rates. Results show that, compared to single-hop link adaptation, the use of multiple hops in indoor environments is only optimal in the energy-throughput space for distances larger than 30 m or when obstacles are present that can be avoided by taking alternative paths. For those larger distances, however, significant gains are possible. Hence, to achieve energy-optimal operation in 802.11a networks, it is important to jointly adapt the physical layer constellation and the network layer path selection.
Cross-layer energy-throughput evaluation of multi-hop/path communication and link adaptation for IEEE 802.11a. S. Pollin, R. Goyens, W. Cleeren, B. Bougard. IEEE Workshop on Signal Processing Systems Design and Implementation, 2005. https://doi.org/10.1109/SIPS.2005.1579832
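The hop-count trade-off in the abstract above can be illustrated with a generic textbook-style model: a path-loss term that grows as distance to the power `alpha`, plus a fixed per-hop cost for the transceiver electronics. This is only a sketch under those assumptions; the constants `e_amp` and `e_fixed` are invented, and the model ignores the rate adaptation and MAC effects that the paper's transceiver models cover.

```python
def energy_per_bit(distance_m, hops, alpha=3.0, e_amp=1e-10, e_fixed=5e-8):
    """Energy to move one bit over `distance_m` using equal-length hops.

    Assumed model: each hop costs e_amp * hop_length**alpha for transmission
    plus a fixed e_fixed for the transceiver electronics.
    """
    hop_length = distance_m / hops
    return hops * (e_amp * hop_length ** alpha + e_fixed)


def best_hops(distance_m, max_hops=8):
    """Hop count in 1..max_hops with the lowest modelled energy per bit."""
    return min(range(1, max_hops + 1), key=lambda k: energy_per_bit(distance_m, k))


for d in (1.0, 10.0, 100.0):
    print(d, best_hops(d))
```

With transmission energy alone (`e_fixed=0`), splitting a link into two hops divides the energy by `2 ** (alpha - 1)`. The fixed per-hop cost is what makes a single hop preferable at short range in this toy model.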
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579922
Jie-Cherng Liu
A family of generalized quadrature modulation (GQM) structures is proposed that includes a prototype FIR or IIR filter within the structure. By choosing a proper filter and setting a single parameter, many applications can be realized, such as tunable band-pass filtering, band-pass Hilbert transformation, single-sideband processing and band-inversion processing. In other words, the GQM structures generalize existing quadrature modulation systems and have the potential to support further applications.
The generalized quadrature modulation structures. Jie-Cherng Liu. IEEE Workshop on Signal Processing Systems Design and Implementation, 2005. https://doi.org/10.1109/SIPS.2005.1579922
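To make the tunable band-pass application concrete, the sketch below implements the classical quadrature modulation filter that such structures build on: the input is mixed with a cosine and a sine at frequency `w0`, each branch passes through the same low-pass prototype `h`, and the branches are mixed back and summed. This is the well-known baseline structure, not the paper's generalization; the function names and the FIR prototype are assumptions.

```python
import math


def fir(h, x):
    """Causal FIR filtering: y[n] = sum_k h[k] * x[n-k]."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def quadrature_bandpass(x, h, w0):
    """Mix down with cos/sin, low-pass both branches with h, mix back up and sum."""
    idx = range(len(x))
    i_branch = fir(h, [x[n] * math.cos(w0 * n) for n in idx])
    q_branch = fir(h, [x[n] * math.sin(w0 * n) for n in idx])
    return [i_branch[n] * math.cos(w0 * n) + q_branch[n] * math.sin(w0 * n)
            for n in idx]


x = [1.0, -2.0, 0.5, 3.0, 0.0, -1.0, 2.0, 1.5]
h = [0.25, 0.5, 0.25]   # small low-pass prototype (illustrative)
w0 = 0.7                # centre frequency in radians per sample
y = quadrature_bandpass(x, h, w0)
# Equivalent direct form: filter x with the modulated prototype h[k] * cos(w0 * k).
direct = fir([h[k] * math.cos(w0 * k) for k in range(len(h))], x)
print(max(abs(a - b) for a, b in zip(y, direct)))
```

Because cos(w0 n) cos(w0 (n-k)) + sin(w0 n) sin(w0 (n-k)) = cos(w0 k), the structure behaves like a band-pass filter centred at `w0`, and retuning only requires changing that one parameter.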
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579953
J. Arnabat-Benedicto, F.C. Tormo
No single scaling technique suits all kinds of images. Final image quality (IQ) depends not only on the scale factor but also on the type of image (photo, CAD, text...) the user wishes to print or display. Formally, any scaling operation can be interpreted as a combination of an anti-alias filter and an interpolation by continuous convolution. In this paper we present a hardware architecture based on this formal framework that performs two-dimensional (2-D) separable image up- and down-scaling with a high degree of flexibility and a low hardware cost. In particular, we propose a convolution interpolator with a programmable kernel memory, develop a design rule for optimizing the kernel coefficient memory size, and report a flexible anti-alias filter. The flexibility provided by the combination of these elements yields superior IQ, since the scaling technique and parameters can be adjusted to each specific type of image.
Flexible hardware architecture for 2-D separable scaling using convolution interpolation. J. Arnabat-Benedicto, F.C. Tormo. IEEE Workshop on Signal Processing Systems Design and Implementation, 2005. https://doi.org/10.1109/SIPS.2005.1579953
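A minimal software sketch of convolution interpolation with a kernel table, in the spirit of the programmable kernel memory described above. The continuous kernel is sampled at a fixed number of sub-pixel phases, and each output sample is a weighted sum of neighbouring input samples. The linear kernel, the phase quantization by truncation, the border clamping and all names are assumptions; the anti-alias filter needed for down-scaling is not modelled.

```python
import math


def linear_kernel(t):
    """Triangle kernel; any continuous kernel could be loaded instead."""
    return max(0.0, 1.0 - abs(t))


def build_kernel_table(kernel, support, phases):
    """table[p][t]: weight of tap t when the output sits p/phases past a pixel."""
    taps = range(-support + 1, support + 1)
    return [[kernel(k - p / phases) for k in taps] for p in range(phases)]


def resample_row(row, scale, table, support, phases):
    """Scale one row; a 2-D separable scaler applies this to rows, then columns."""
    out = []
    for j in range(int(len(row) * scale)):
        u = j / scale                     # output position in input coordinates
        i0 = math.floor(u)
        phase = int((u - i0) * phases)    # quantized sub-pixel offset
        acc = 0.0
        for t, k in enumerate(range(-support + 1, support + 1)):
            idx = min(max(i0 + k, 0), len(row) - 1)  # clamp at the borders
            acc += table[phase][t] * row[idx]
        out.append(acc)
    return out


table = build_kernel_table(linear_kernel, support=1, phases=4)
print(resample_row([0, 10, 20, 30], 2, table, support=1, phases=4))
# prints [0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 30.0]
```

The table holds `phases * 2 * support` coefficients, so the number of phases and the kernel support are the two knobs that set the coefficient memory size in this sketch.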
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579868
P. Bougas, A. Tsirikos, P. Kalivas, K. Pekmestzi
In this paper, a novel architecture for the implementation of serial parallel multipliers (SPM) is proposed. The proposed multiplier is based on segmenting a simple SPM into blocks of equal bit length. It achieves higher throughput because it requires only a small number of zeros to start a new multiplication cycle, at a moderate hardware expense, and it achieves a significant hardware reduction compared to the double-precision SPM. The proposed technique permits optimization of the area-time product.
Segmentation based design of serial parallel multipliers. P. Bougas, A. Tsirikos, P. Kalivas, K. Pekmestzi. IEEE Workshop on Signal Processing Systems Design and Implementation, 2005. https://doi.org/10.1109/SIPS.2005.1579868
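For context, the sketch below is a behavioural model of a plain (unsegmented) serial parallel multiplier: the multiplicand is held in parallel, the multiplier arrives one bit per cycle with the least significant bit first, and one product bit leaves per cycle. It shows why a simple SPM needs `n` padding zeros after the `n` multiplier bits before a new multiplication can start. It does not model the segmentation technique of the paper; the function name and the unsigned operands are assumptions.

```python
def serial_parallel_multiply(a, b, n):
    """Unsigned n-bit a (parallel) times n-bit b (serial, LSB first)."""
    acc = 0
    product_bits = []
    for cycle in range(2 * n):
        # After the n multiplier bits, zeros are fed to flush the high half.
        b_bit = (b >> cycle) & 1 if cycle < n else 0
        acc += a * b_bit               # add the gated multiplicand
        product_bits.append(acc & 1)   # emit one product bit per cycle
        acc >>= 1                      # keep the carry-save remainder
    return sum(bit << i for i, bit in enumerate(product_bits))


print(serial_parallel_multiply(13, 11, 4))  # prints 143
```

In this model half of the `2 * n` cycles carry only padding zeros, which is the overhead that a scheme needing fewer zeros between operands would reduce.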
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579859
S. Tota, M. Casu, M. Roch, M. Zamboni
Intra-chip communication infrastructures are receiving increasing attention as they become a crucial part of the development of current SoCs. Owing to the wide availability of pre-characterized hard IP, design complexity is shifting toward the global interconnect, which imposes more constraints at each technology node. Power consumption, timing closure, bandwidth requirements and time to market are some of the factors driving proposals for new solutions for next-generation multi-million SoCs. The need for highly programmable systems and the availability of high gate counts are drawing more attention to multiprocessor systems (MP-SoC), so an adequate solution must be found for the communication infrastructure. One of the most promising technologies is the network-on-chip (NoC) architecture, which seems better suited to the demanding complexity of such systems. Before developing new solutions, it is crucial to understand if and when current bus architectures introduce strong limitations in the development of high-speed systems. This article describes a case study of a multiprocessor-based Ethernet packet-switch application with a shared-bus communication infrastructure. The system is intended to expose the bottlenecks that a shared bus introduces under heavy load. The analysis shows that, as expected, a shared bus is not scalable and strongly limits overall system performance. These results strengthen the hypothesis that new communication architectures (such as the NoC) must be found.
A multiprocessor based packet-switch: performance analysis of the communication infrastructure. S. Tota, M. Casu, M. Roch, M. Zamboni. IEEE Workshop on Signal Processing Systems Design and Implementation, 2005. https://doi.org/10.1109/SIPS.2005.1579859
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579906
D. Gavrilis, I. Tsoulos
There is an ongoing effort to develop advanced methods and computer-based systems to assist obstetricians in the difficult task of feature extraction and classification of the cardiotocogram (CTG), which is the most widely used electronic fetal monitoring (EFM) method worldwide. A novel feature construction method is presented for efficient classification of the CTG based on information extracted from the fetal heart rate (FHR) signal. The proposed method uses grammatical evolution to construct new features from existing ones through nonlinear transformations. The method is tested on a data set of intrapartum cases, achieving an accuracy of 92.5%.
Classification of fetal heart rate using grammatical evolution. D. Gavrilis, I. Tsoulos. IEEE Workshop on Signal Processing Systems Design and Implementation, 2005. https://doi.org/10.1109/SIPS.2005.1579906
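The core of grammatical evolution is a mapping from an integer genome to an expression: each codon, taken modulo the number of available productions, selects the rule used to expand the leftmost non-terminal. The sketch below shows that standard mapping for a toy feature grammar. The grammar, the feature names `x0` to `x2`, the function set and the step limit are all assumptions for illustration and are not taken from the paper.

```python
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"],
               ["<func>", "(", "<expr>", ")"],
               ["<var>"]],
    "<op>": [["+"], ["-"], ["*"]],
    "<func>": [["sin"], ["exp"]],
    "<var>": [["x0"], ["x1"], ["x2"]],   # hypothetical original features
}


def decode(genome, start="<expr>", max_steps=100):
    """Map a list of integer codons to an expression string.

    Returns None if the expansion does not terminate within max_steps,
    which marks the individual as invalid.
    """
    symbols, out = [start], []
    used = 0
    while symbols:
        sym = symbols.pop(0)
        if sym not in GRAMMAR:           # terminal: copy to the output
            out.append(sym)
            continue
        if used >= max_steps:
            return None
        choices = GRAMMAR[sym]
        codon = genome[used % len(genome)]   # wrap around the genome
        symbols = list(choices[codon % len(choices)]) + symbols
        used += 1
    return "".join(out)


print(decode([0, 2, 0, 1, 1, 0, 2, 2]))  # prints x0-sin(x2)
```

An evolutionary search would then evolve the genomes and score each decoded feature by the classification accuracy it yields; that outer loop is omitted here.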
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579890
E. Zois, A. Nassiopoulos, V. Anastassopoulos
A novel technique is presented for off-line signature recognition and verification. The feature extraction procedure employs directional vectors, similar to those used in chain codes, which provide a global measure of the signature image. The signature trace is transformed into the feature vector by measuring the directional strength of line segments having a chessboard distance equal to two. A probabilistic neural topology is employed for the design of the classifier. In order to obtain comparable results, the method was applied to a database already used in the literature. The verification procedure provides a low classification error for authentic signatures while rejecting forgeries.
Signature verification based on line directionality. E. Zois, A. Nassiopoulos, V. Anastassopoulos. IEEE Workshop on Signal Processing Systems Design and Implementation, 2005. https://doi.org/10.1109/SIPS.2005.1579890
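One possible reading of the directional feature described above is sketched below: for every stroke pixel, look at the 16 pixels at chessboard distance two (the border of the surrounding 5x5 window) and count how often each offset also lands on the stroke. The normalized counts form a global direction histogram. This interpretation, the binary image format and the function names are assumptions; the paper's exact definition of directional strength may differ.

```python
def ring_offsets(d=2):
    """All (dx, dy) offsets whose chessboard distance max(|dx|, |dy|) equals d."""
    return [(dx, dy) for dy in range(-d, d + 1) for dx in range(-d, d + 1)
            if max(abs(dx), abs(dy)) == d]


def directional_histogram(image):
    """image: list of rows of 0/1 values, with 1 marking the signature trace."""
    offsets = ring_offsets(2)
    height, width = len(image), len(image[0])
    counts = [0] * len(offsets)
    for y in range(height):
        for x in range(width):
            if not image[y][x]:
                continue
            for i, (dx, dy) in enumerate(offsets):
                yy, xx = y + dy, x + dx
                if 0 <= yy < height and 0 <= xx < width and image[yy][xx]:
                    counts[i] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(offsets)


stroke = [[1, 1, 1, 1, 1]]   # a short horizontal stroke
hist = directional_histogram(stroke)
print(hist[ring_offsets().index((2, 0))])  # prints 0.5
```

For the horizontal stroke all of the weight falls on the two horizontal offsets, which is the kind of global directional signature a classifier could take as input.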
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579954
G. Pastuszak
An architecture for EBCOT in JPEG 2000 is presented. The architecture embeds all functions necessary to produce the final codestream consistent with the JPEG 2000 specification. A number of hardware optimisation methods are used to achieve high throughput at a relatively low cost in hardware resources. The architecture is verified in simulations and synthesized for ASIC and FPGA technologies. Implementation results for FPGA Stratix II devices show that it can work at 120 MHz and process about 40 million samples per second in the regular lossless mode.
A high-performance architecture for EBCOT in the JPEG 2000 encoder. G. Pastuszak. IEEE Workshop on Signal Processing Systems Design and Implementation, 2005. https://doi.org/10.1109/SIPS.2005.1579954
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579882
Mo Li, Ronggang Wang, Wuchen Wu
In this paper, we propose a parallel, pipelined architecture for the sub-pixel interpolation filter in an H.264/AVC-conformant HDTV decoder. To use the bus bandwidth efficiently, we introduce two memory access optimization strategies that avoid redundant data transfers and improve data bus utilization. To improve processing throughput, we use a parallel, multi-stage pipeline architecture that performs data transmission and interpolation filtering in parallel. Compared to traditional designs, our scheme reduces memory data transfer by 60%. Clocked at 66 MHz, our design can support a processing throughput of 1280×720 at 30 Hz. The proposed design is suitable for system-on-chip design.
The high throughput and low memory access design of sub-pixel interpolation for H.264/AVC HDTV decoder. Mo Li, Ronggang Wang, Wuchen Wu. IEEE Workshop on Signal Processing Systems Design and Implementation, 2005. https://doi.org/10.1109/SIPS.2005.1579882
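Two small calculations help put the sub-pixel interpolation figures above in context. The 6-tap kernel (1, -5, 20, 20, -5, 1) with rounding and a right shift by 5 is the luma half-sample filter defined by H.264/AVC. The function names, the 8-bit sample range and the per-macroblock cycle budget helper are illustrative assumptions and do not describe the pipeline of the paper.

```python
def half_pel(samples):
    """H.264 luma half-sample value from six neighbouring integer samples."""
    e, f, g, h, i, j = samples
    value = (e - 5 * f + 20 * g + 20 * h - 5 * i + j + 16) >> 5
    return min(255, max(0, value))       # clip to the 8-bit sample range


def cycles_per_macroblock(clock_hz, width, height, frame_rate):
    """Clock cycles available per 16x16 macroblock at a given frame rate."""
    macroblocks_per_frame = (width // 16) * (height // 16)
    return clock_hz // (macroblocks_per_frame * frame_rate)


print(half_pel([0, 0, 0, 255, 255, 255]))                 # prints 128
print(cycles_per_macroblock(66_000_000, 1280, 720, 30))   # prints 611
```

At 66 MHz a 1280×720, 30 Hz stream therefore leaves roughly 611 cycles for all processing of each macroblock, which is why overlapping data transfer with filtering matters for a design in this class.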