Pub Date: 1999-09-23
DOI: 10.1109/ICCIMA.1999.798574
R. Suresh, S. Arumugam, L. Ganesan
Fuzzy set theory provides an approximate but effective means of describing the behavior of ill-defined systems. Patterns of human origin, such as handwritten characters, are to some extent fuzzy in nature, which motivates a fuzzy approach. The paper applies the fuzzy concept to handwritten Tamil characters, classifying each as one of the prototype characters using a feature called distance from the frame and a suitable membership function. The prototype characters are divided into two classes: line characters/patterns and arc patterns. An unknown input character is first assigned to one of these two classes and then recognized as one of the characters in that class. The algorithm is tested on about 250 samples of seven chosen Tamil characters, and the success rate obtained varies from 88% to 100%.
Title: Fuzzy approach to recognize handwritten Tamil characters
Venue: Proceedings Third International Conference on Computational Intelligence and Multimedia Applications. ICCIMA'99 (Cat. No.PR00300)
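The two-stage classification described in this abstract can be illustrated with a minimal sketch. It assumes a triangular membership function and a single scalar "distance from the frame" value per character; the abstract specifies neither, and the prototype names and numbers below are hypothetical placeholders, not values from the paper.

```python
from typing import Dict, Tuple


def triangular(x: float, centre: float, width: float) -> float:
    """Triangular membership: 1 at centre, falling linearly to 0 at centre +/- width."""
    return max(0.0, 1.0 - abs(x - centre) / width)


# Hypothetical prototypes: class -> character -> typical feature value.
PROTOTYPES: Dict[str, Dict[str, float]] = {
    "line": {"char_a": 0.20, "char_b": 0.35},
    "arc": {"char_c": 0.60, "char_d": 0.80},
}


def classify(distance: float, width: float = 0.25) -> Tuple[str, str, float]:
    """Return (class, character, membership) for one feature value."""

    def class_score(cls: str) -> float:
        # A class scores as well as its best-matching prototype.
        return max(triangular(distance, c, width) for c in PROTOTYPES[cls].values())

    # Stage 1: choose between the line class and the arc class.
    best_class = max(PROTOTYPES, key=class_score)
    # Stage 2: choose the character inside the winning class.
    chars = PROTOTYPES[best_class]
    best_char = max(chars, key=lambda ch: triangular(distance, chars[ch], width))
    return best_class, best_char, triangular(distance, chars[best_char], width)
```

A real recognizer would presumably use a feature vector per character rather than one number; the scalar keeps the two stages easy to follow.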
Pub Date: 1999-09-23
DOI: 10.1109/ICCIMA.1999.798565
F. Vignoli, F. Lavagetto
The bimodal acoustic-visual effect is of extreme importance in human face-to-face communication; it has been broadly investigated and the improvement in understanding when visual cues are integrated with speech has been clearly demonstrated, with particular emphasis in noisy environments. In this paper, we propose a novel synchronization procedure for speech and text, consisting of a neural network-based acoustic segmentation method for phoneme classes and a phonetic-acoustic time alignment algorithm which we call Segmental Time-Alignment (STA). The proposed algorithm is fast and speaker-independent since it uses neural networks trained to discriminate among broad phoneme classes. This technique has been used to animate the MPEG-4 compliant DIST face model.
Title: A segmental time-alignment technique for text-speech synchronization
Venue: ICCIMA'99 proceedings
Pub Date: 1999-09-23
DOI: 10.1109/ICCIMA.1999.798538
J. Karlelar, U. Desai
A novel hybrid compression scheme for videoconferencing and videotelephony applications at very low bit rates (i.e., 32 Kbits/s) is presented. The human face is the most important region within a frame and should be coded with high fidelity. To preserve perceptually important information such as face regions at low bit rates, skin tone is used to detect these regions and quantize them adaptively. Novel features of this coder are the use of overlapping block motion compensation in combination with the discrete wavelet transform, followed by zerotree entropy coding with a new scanning procedure for wavelet blocks, such that the rest of the H.263 framework can be used. At the same total bit rate, coarser quantization of the background enables the face region to be quantized finely and coded with higher quality.
Title: Content based video compression for high perceptual quality videoconferencing using wavelet transform
Venue: ICCIMA'99 proceedings
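The trade-off between fine quantization of face regions and coarse quantization of the background can be sketched as follows. This is only an illustration under stated assumptions: the chrominance thresholds are a commonly quoted Cb/Cr skin range rather than the authors' detector, and the step sizes, the 50% rule, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def is_skin(cb: int, cr: int) -> bool:
    """Assumed skin-tone test on 8-bit chrominance values (not from the paper)."""
    return 77 <= cb <= 127 and 133 <= cr <= 173


def choose_step(block: Sequence[Tuple[int, int]], fine: int = 4,
                coarse: int = 16, min_skin_fraction: float = 0.5) -> int:
    """Pick a fine step for blocks that are mostly skin, a coarse step otherwise."""
    skin = sum(1 for cb, cr in block if is_skin(cb, cr))
    return fine if skin >= min_skin_fraction * len(block) else coarse


def quantize(coeffs: Sequence[float], step: int) -> List[int]:
    """Uniform quantizer: round each coefficient to the nearest multiple of step."""
    return [math.floor(c / step + 0.5) for c in coeffs]
```

With the same coefficients, the coarse step zeroes out more values than the fine step, which is where the background bit savings in such a scheme would come from.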
Pub Date: 1999-09-23
DOI: 10.1109/ICCIMA.1999.798555
S. Wei, V. Tsaoussidis, V. Venkatakrishnan
Traditionally, Internet transport protocols provide either reliable, ordered service or unreliable, unordered service. It has been widely accepted that a variety of applications (e.g. multimedia) can trade off reliability mechanisms to achieve higher throughput. A solution that additionally respects user preferences for QoS characteristics (e.g. reliability, cost, throughput, delay) is currently in demand. We propose an application-oriented transport protocol (AOTP) to handle partially or completely reliable transport service, favoring throughput at the expense of reliability, or dropping the reliability level in order to keep the cost at a desired level. The protocol can be applied to cases where resource reservation is not possible or desired. To save bandwidth, our approach does not use forward error correction but instead uses a receiver-based retransmission mechanism. The protocol is therefore appropriate for applications that tolerate losses, so the need for retransmission (additional RTTs) does not arise often. We present encouraging initial results tested over Ethernet links; we compare AOTP with TCP and a TCP-like protocol without congestion control (TCPWCC), contrasting throughput results for different levels of reliability requirements.
Title: QoS tradeoffs using an application-oriented transport protocol (AOTP) for multimedia applications over IP
Venue: ICCIMA'99 proceedings
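The idea of receiver-based retransmission under a partial reliability level can be sketched roughly as below. This is not the AOTP algorithm itself, whose details the abstract does not give; the function name, the window model, and the interpretation of "reliability" as a required delivered fraction are all assumptions made for illustration.

```python
import math
from typing import List, Set


def packets_to_request(total: int, received: Set[int], reliability: float) -> List[int]:
    """Return the sequence numbers a receiver would ask the sender to resend.

    total       -- packets in the window, numbered 0 .. total-1
    received    -- sequence numbers that arrived
    reliability -- required delivered fraction, 0.0 (unreliable) to 1.0 (fully reliable)
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0.0 and 1.0")
    # Small epsilon guards against floating-point noise pushing ceil() up by one.
    required = math.ceil(reliability * total - 1e-9)
    missing = [seq for seq in range(total) if seq not in received]
    delivered = total - len(missing)
    shortfall = max(0, required - delivered)
    # Request only as many lost packets as are needed to reach the target.
    return missing[:shortfall]
```

For a loss-tolerant application the returned list is often empty, which matches the abstract's point that extra round trips for retransmission should be rare.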
Pub Date: 1999-09-23
DOI: 10.1109/ICCIMA.1999.798545
K. Tabb, S. George, R. Adams, N. Davey
This paper documents experiments carried out with several neural network systems designed to distinguish pedestrian shapes from non-pedestrian shapes. Active contour models ('snakes') have been used to obtain contours of pedestrians as they move around the visual field. Neural networks have then been trained on representations of these relaxed snakes, and can successfully discriminate these contours based upon whether they are 'pedestrian' in shape or not.
Title: Human shape recognition from snakes using neural networks
Venue: ICCIMA'99 proceedings
Pub Date: 1999-09-23
DOI: 10.1109/ICCIMA.1999.798546
P. Pagliani
Information systems unable to uniquely describe each single object and information systems with attributes that cannot be evaluated for some objects are analysed from a logico-algebraic point of view. It is shown that, following purely logical arguments, it is possible to define operations that are suitable for knowledge discovery within these incomplete information systems and for connecting patterns of data.
Title: Logico-algebraic structures of partially defined objects and of partially denoting attributes
Venue: ICCIMA'99 proceedings
Pub Date: 1999-09-23
DOI: 10.1109/ICCIMA.1999.798563
G. Hovden, N. Ling
In this paper, we present several speed optimizing techniques in our MPEG-4 video decoder for real-time multimedia applications. Accessor functions and a simple data structure are used. The use of pointers is minimized. Fast algorithms and fixed-point arithmetic are applied whenever possible. Memory accesses are kept to the minimum. Our results show that our software can decode and display MPEG-4 QCIF videos about eight times faster than the existing MoMuSys decoder.
Title: On speed optimization of MPEG-4 decoder for real-time multimedia applications
Venue: ICCIMA'99 proceedings
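Of the optimizations listed in this abstract, fixed-point arithmetic is the easiest to show in isolation. The sketch below uses a Q16.16 format, which is an assumption; the abstract does not say which format or word size the decoder uses, and the helper names are hypothetical.

```python
FRAC_BITS = 16          # assumed Q16.16 layout: 16 integer bits, 16 fractional bits
ONE = 1 << FRAC_BITS    # fixed-point representation of 1.0


def to_fixed(x: float) -> int:
    """Convert a float to its nearest Q16.16 integer representation."""
    return int(round(x * ONE))


def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 values using only integer multiply and shift."""
    return (a * b) >> FRAC_BITS


def to_float(a: int) -> float:
    """Convert a Q16.16 value back to a float, for inspection only."""
    return a / ONE
```

The inner loop then needs only integer multiplies and shifts. Note that the right shift rounds toward negative infinity, so products of negative values can differ from the exact result by one least-significant bit.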
Pub Date: 1999-09-23
DOI: 10.1109/ICCIMA.1999.798492
T. Watanabe
The management of long lived transactions is an interesting topic in the application of databases for long term enterprise requests. Some methods have been proposed to manage this type of transaction well: nested transaction model, SAGAS model, etc. However, the models are not always successful at controlling various classes of long lived transactions effectively. We describe our 3-layer model of long lived transaction which was developed to avoid such troublesome problems, and address the management strategy to be constructed under the 3-layer model, based on work flow and task graph.
Title: Agent-oriented model for managing long-lived transaction, based on work-flow and task-graph
Venue: ICCIMA'99 proceedings
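For readers unfamiliar with the saga approach that this abstract cites as prior work, a minimal sketch follows. It illustrates the general saga idea (a sequence of steps, each paired with a compensating action that is run in reverse order on failure), not the authors' 3-layer model, which the abstract does not describe in detail. The names `Step` and `run_saga` are hypothetical.

```python
from typing import Callable, List, Tuple

# Each step pairs a forward action with the compensation that undoes it.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    compensations: List[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # Undo everything that already committed, most recent first.
            for undo in reversed(compensations):
                undo()
            return False
        compensations.append(compensate)
    return True
```

The failed step itself is not compensated in this sketch, on the assumption that a failed action leaves nothing to undo.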
Pub Date: 1999-09-23
DOI: 10.1109/ICCIMA.1999.798500
V. Karri, F. Frost
An important consideration when designing neural network training data is to carefully select those variables that are to be used as inputs. Only those parameters that contribute towards improving the accuracy of the network's prediction should be included as input parameters. Despite a large variety of neural network models, backpropagation (BP) is the most commonly applied model for an extensive range of applications. However, when applying BP networks to process modelling or control, it is necessary to select the correct network architecture and activation functions in order to minimise the computation time and maximise the network's accuracy. In addition, in order to improve network performance, it is necessary to use sufficient training data, spanning a comprehensive input range. While many of the techniques for improving network performance are based on a heuristic approach, some important aspects are detailed in this paper for selecting the optimum network conditions, with respect to computation time and accuracy, using a mathematical function as a sample application.
Title: Optimum back propagation network conditions with respect to computation time and output accuracy
Venue: ICCIMA'99 proceedings
Pub Date: 1999-09-23
DOI: 10.1109/ICCIMA.1999.798554
S. Shamala, A. Ramani, M. Yazid
In this research, two pro-active dynamic quality of service (QoS) control schemes are designed: the dynamic QoS control scheme with delay estimation, and the hybrid dynamic QoS control scheme. The two schemes aim at fulfilling the requested level of QoS while simultaneously achieving high resource utilization. Emphasis is placed upon buffer management via administration schemes of packet loss and packet delay. The results obtained through the simulation models show that the two schemes significantly improve the average delay for different traffic patterns. The proposed schemes can be adopted for multimedia applications to enhance the QoS in terms of better delay and improved resource utilization.
Title: Predictive QoS schemes for real-time multimedia applications in communications
Venue: ICCIMA'99 proceedings