Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576794
Tee Yi Wen, S. A. M. Aris, S. A. Jalil, S. Usman
Stress is the body’s natural reaction to life events, and chronic stress disrupts the body’s physiological equilibrium, ultimately harming physical and mental health. Hence, developing a stress-level monitoring system is necessary and important for clinical intervention and disease prevention. In this study, an electroencephalography (EEG) acquisition tool was used to capture brainwave signals at the prefrontal cortex (Fp1 and Fp2) from 50 participants and to investigate the brain states related to stress induced by a virtual reality (VR) horror video and an intelligence quotient (IQ) test. The collected EEG signals were pre-processed to remove artifacts, and EEG features associated with stress were extracted through frequency-domain analysis as power spectral density (PSD) values of the Theta, Alpha and Beta frequency bands. The Wilcoxon signed-rank test was carried out to find significant differences in absolute power between the resting baseline and post-stimuli. The test showed that, for single-electrode EEG features, Theta absolute power increased significantly at the Fp1 electrode (p<0.001) and the Fp2 electrode (p<0.015) post-IQ, whereas Beta absolute power at the Fp2 electrode increased significantly under both conditions, post-VR (p<0.024) and post-IQ (p<0.011). Following this, the significant features were clustered into three stress-level groups using the k-means clustering method, and the labelled data were fed into a support vector machine (SVM) to classify the stress levels. 10-fold cross-validation was applied to evaluate the classifier’s performance, with the results confirming a highest accuracy of 98% in distinguishing the three stress states using only the Beta-band absolute power feature from a single electrode (Fp2).
Title: Electroencephalogram Stress Classification of Single Electrode using K-means Clustering and Support Vector Machine. In: 2021 IEEE International Conference on Signal and Image Processing Applications (ICSIPA).
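The pipeline above — unsupervised k-means labelling of a single PSD feature into three stress levels, followed by SVM classification under 10-fold cross-validation — can be sketched as follows. This is a minimal illustration on synthetic data (the study's EEG feature values are not published with the abstract), and it uses scikit-learn defaults rather than the paper's exact SVM settings:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for Beta-band absolute power at Fp2 (one value per epoch);
# the real feature values and cluster centres are assumptions for illustration.
rng = np.random.default_rng(0)
beta_power = np.concatenate([
    rng.normal(5.0, 0.5, 100),   # low-stress cluster
    rng.normal(10.0, 0.5, 100),  # medium-stress cluster
    rng.normal(15.0, 0.5, 100),  # high-stress cluster
]).reshape(-1, 1)

# Step 1: unsupervised labelling into three stress levels with k-means.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(beta_power)

# Step 2: supervised classification of the k-means labels with an SVM,
# evaluated by 10-fold cross-validation as in the paper.
scores = cross_val_score(SVC(kernel="rbf"), beta_power, labels, cv=10)
print(f"mean 10-fold accuracy: {scores.mean():.2f}")
```

With well-separated clusters like these, the SVM recovers the k-means labels almost perfectly; the paper's reported 98% is on real EEG features, not this toy data.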
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576781
A. Wazir, H. A. Karim, Nouar Aldahoul, M. F. A. Fauzi, Sarina Mansor, Mohd Haris Lye Abdullah, Hor Sui Lyn, Tabibah Zainab Zulkifli
Foul language exists in films, video-sharing platforms, and social media platforms, which increases the risk of viewers being exposed to a large number of profane words with negative personal and social impact. This work proposes CNN-based recognition of spoken Malay foul words to establish a basis for spoken profanity detection for monitoring and censorship purposes. A novel foul-speech dataset containing 1512 samples was collected, processed, and annotated. The dataset was then converted into a spectral representation of Mel-spectrogram images to be used as input to the CNN model. This research proposes a lightweight CNN model with only six convolutional layers and small filters to minimize the computational cost. The proposed model’s performance affirms the viability of the proposed visual-based classification method using a CNN, achieving an average Malay foul-speech classification accuracy of 86.50%, precision of 88.68%, and F-score of 86.83%. The normal conversational class outperformed the foul-words class due to data imbalance and the rarity of foul-speech samples compared to normal speech terms.
Title: Spoken Malay Profanity Classification Using Convolutional Neural Network.
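To make the “lightweight” claim concrete, the following sketch counts the weights of a six-convolution-layer network with small 3×3 filters. The channel widths here are hypothetical — the abstract does not specify them — but the point stands: small filters and modest widths keep the parameter count low:

```python
def conv2d_params(in_ch, out_ch, k):
    """Weights + biases of a single k x k convolution layer."""
    return in_ch * out_ch * k * k + out_ch

# Hypothetical channel progression for a six-convolution-layer network with
# 3x3 filters; a single-channel Mel-spectrogram image is assumed as input.
channels = [1, 16, 16, 32, 32, 64, 64]
total = sum(conv2d_params(c_in, c_out, 3)
            for c_in, c_out in zip(channels, channels[1:]))
print(f"convolutional parameters: {total:,}")  # 71,792
```

Roughly 72k convolutional weights is orders of magnitude below typical image-classification backbones, which is what makes such a model cheap to run for censorship pipelines.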
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576812
M. Samsudin, S. Abu-Bakar, M. Mokji
In visual understanding, images taken from different cameras usually have different resolutions, illumination, poses, and background views, which leads to domain shift. Besides, labeling these data is an expensive operation. These problems lead to the need for unsupervised domain adaptation (UDA), in which training and testing data are not drawn from the same distribution and labels are not available in the target domain. This paper presents an improvement for unsupervised domain adaptation in transfer learning using unified discriminant and distribution alignment (UDDA). Existing UDA methods utilize only unsupervised PCA as the dimensionality-reduction step before adding it to the joint objective function consisting of distribution-discrepancy minimization and regularization. However, some works (e.g., joint geometrical and statistical alignment (JGSA)) have utilized the labels in the source domain via the supervised method LDA and shown good improvement. Nevertheless, LDA has drawbacks: it is sensitive to noise and outliers because of its squared operations, and it captures only global information. The contribution of this paper is to add local discriminant information to the proposed UDA model by adopting locality-sensitive discriminant analysis (LSDA), which strengthens the between-class and within-class discrimination in the source domain. In this method, within-class and between-class graphs are computed and summed with the within-class and between-class scatter matrices before being embedded into the joint domain-adaptation framework. Compared with state-of-the-art techniques on object and digit datasets, the proposed UDDA improves the average accuracy by 3.43% and 7.13% over the second- and third-highest results, respectively.
Title: Unified Discriminant and Distribution Alignment for Visual Domain Adaptation.
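The global between-class and within-class scatter matrices that LDA-style terms (as in JGSA) build on can be computed as below. Note that LSDA, as adopted in the paper, additionally constructs within-class and between-class neighbourhood graphs to capture local structure; this minimal sketch shows only the classical global scatters:

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class (Sw) and between-class (Sb) scatter of labelled data."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)          # spread around each class mean
        diff = (mc - mean_all).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)        # spread of class means
    return Sw, Sb

# Two well-separated synthetic classes in 3 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(4, 1, (50, 3))])
y = np.repeat([0, 1], 50)
Sw, Sb = scatter_matrices(X, y)
```

Discriminant methods then seek projections that make `Sb` large relative to `Sw`; LSDA replaces these global sums with graph-weighted local versions before embedding them in the joint objective.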
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576773
M. Shajahan, S. A. M. Aris, S. Usman, N. Noor
The use of dental radiography techniques such as X-ray has increased diagnostic efficiency, but the images contain impulse noise. Therefore, denoising is a very important factor in any subjective evaluation of image quality. The proposed impulse-noise removal method, namely ‘Impulse noise Removal using Partition-aided Median, Interpolation and DWT techniques’ (IRPMID), is successfully implemented on dental X-ray images. Median filtering, interpolation, and the Discrete Wavelet Transform (DWT), together with a partition-based approach, empower the proposed method in denoising salt-and-pepper noise. The method remains effective even at noise densities above 90%, and the dental X-ray images gain a significant improvement in image enhancement in terms of PSNR, IEF, SSIM, etc.
Title: IRPMID: Medical XRAY Image Impulse Noise Removal using Partition Aided Median, Interpolation and DWT.
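The median-filter component of such a scheme can be illustrated as follows — a simple decision-based filter that, as an assumption for illustration, treats pixel values 0 and 255 as impulse-corrupted and replaces only those pixels with the median of their uncorrupted neighbours. IRPMID's partitioning, interpolation, and DWT stages are not reproduced here:

```python
import numpy as np

def remove_impulse_noise(img):
    """Replace only corrupted pixels (0 or 255) with the median of their
    uncorrupted 3x3 neighbours -- a simple decision-based median filter."""
    out = img.copy()
    padded = np.pad(img, 1, mode="edge")
    noisy = (img == 0) | (img == 255)
    for r, c in zip(*np.nonzero(noisy)):
        window = padded[r:r + 3, c:c + 3].ravel()
        good = window[(window != 0) & (window != 255)]
        if good.size:                       # leave the pixel if no clean neighbour
            out[r, c] = np.median(good)
    return out

clean = np.full((8, 8), 120, dtype=np.uint8)
noisy = clean.copy()
noisy[2, 3], noisy[5, 5] = 0, 255           # inject salt-and-pepper impulses
restored = remove_impulse_noise(noisy)
```

Filtering only the detected impulse pixels (rather than every pixel) is what preserves edges and fine detail, which matters for the PSNR/SSIM gains the paper reports.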
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576793
Obaid Rafiq Jan, H. S. Jo, Riady Siswoyo Jo
Flood-related disasters have caused millions of displaced people and casualties worldwide. Hence it is necessary to develop monitoring systems capable of measuring water levels and depths by means of sensors or camera surveillance. Contact measurement sensors have the disadvantages of regular maintenance and high operational costs, whereas camera surveillance combined with computer vision and image processing overcomes these drawbacks. In this study, traditional image-processing techniques and more advanced, current state-of-the-art computer vision techniques are compared and analyzed according to their implementation.
Title: A Critical Review on Water Level Measurement Techniques for Flood Mitigation.
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576798
Roozbeh KhabiriKhatiri, I. A. Latiff, Ahmad Sabry Mohamad
Traffic sign detection and recognition (TSDR) is one of the main areas of research in autonomous vehicles and Advanced Driver Assistance Systems (ADAS). In this paper, a method is proposed to detect and classify the prohibitory subset of the German Traffic Sign dataset. The traffic-sign detection module utilizes adaptive color segmentation, based on the mean saturation value of a local neighborhood, and the Circular Hough Transform (CHT) to locate traffic signs in the input images. Compared to global thresholding, adaptive color thresholding shows improved segmentation of traffic signs in images with uneven lighting or very high or low contrast. Furthermore, the number of false alarms is minimized by an additional validation stage. For the recognition phase of the algorithm, multiple deep Convolutional Neural Networks (CNNs) with different structures are developed from scratch to compare their performance and identify the network with the highest accuracy.
Title: Road Traffic Sign Detection and Recognition using Adaptive Color Segmentation and Deep Learning.
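The idea of adapting a saturation threshold to the local neighbourhood, rather than using one global value, can be sketched as below. The block size and scaling factor are hypothetical choices for illustration; the paper's exact segmentation rule may differ:

```python
import numpy as np

def adaptive_saturation_mask(sat, block=16, k=1.3):
    """Mark pixels whose saturation exceeds k times the mean saturation of
    their local block -- the threshold adapts to local lighting/contrast."""
    h, w = sat.shape
    mask = np.zeros((h, w), dtype=bool)
    for r in range(0, h, block):
        for c in range(0, w, block):
            tile = sat[r:r + block, c:c + block]
            mask[r:r + block, c:c + block] = tile > k * tile.mean()
    return mask

rng = np.random.default_rng(1)
sat = rng.uniform(0.1, 0.3, (32, 32))   # dull, unevenly lit background
sat[8:12, 8:12] = 0.9                   # a strongly saturated sign-like region
mask = adaptive_saturation_mask(sat)
```

Because each block computes its own threshold, a dimly lit sign still stands out against its own neighbourhood, which a single global threshold would miss.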
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576810
T. Huxohl, F. Kummert
The automatic localization and classification of defects on surfaces helps to ensure the quality of industrially manufactured products. The development of such automatic systems requires a measure that allows sound optimization and comparison. However, there is currently no measure dedicated specifically to surface-defect localization, and measures from related fields are unsuitable. Thus, we present the Region Detection Rate (RDR), which is specialized for defect localization in that it is evaluated defect-wise. It entails a set of rules that define the circumstances under which a defect is considered detected and a prediction is considered a false positive. The usability of the RDR is qualitatively demonstrated on examples from three different datasets, one of which was annotated as part of this work. We hope that the new measure supports the development of future automatic surface-defect localization systems and raises a discussion about the suitability of measures for this task.
Title: Region Detection Rate: An Applied Measure for Surface Defect Localization.
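A defect-wise evaluation in the spirit of the RDR can be sketched as follows, using simple box overlap as the matching rule — an assumption for illustration, since the paper defines its own, more refined set of rules:

```python
def overlaps(a, b):
    """Axis-aligned boxes (x1, y1, x2, y2); True if they intersect."""
    return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])

def defect_wise_rate(defects, predictions):
    """Fraction of ground-truth defects hit by at least one prediction,
    plus the count of predictions that hit no defect (false positives)."""
    detected = sum(any(overlaps(d, p) for p in predictions) for d in defects)
    false_pos = sum(not any(overlaps(p, d) for d in defects) for p in predictions)
    return detected / len(defects), false_pos

defects = [(0, 0, 10, 10), (20, 20, 30, 30)]
predictions = [(2, 2, 8, 8), (50, 50, 60, 60)]
rate, fp = defect_wise_rate(defects, predictions)
print(rate, fp)  # 0.5 detection rate, 1 false positive
```

Counting per defect rather than per pixel is the key design choice: a tiny defect found by a small prediction counts exactly as much as a large one, which pixel-wise IoU-style measures do not guarantee.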
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576808
Jiwoo Kang, H. Yoon, Seongmin Lee, Sanghoon Lee
Detecting corners in an image is an essential step for camera calibration in geometric computer vision and image-processing applications. In this paper, a novel framework is proposed to detect sparse checkerboard corners using the global context of an image. The proposed framework addresses two major problems of previous neural-network-based corner-detection networks: locality and non-sparsity. Our framework encodes the global context of an image and uses it to determine the per-patch existence of the checkerboard. This enables the network to distinguish between the checkerboard pattern and pattern-like noise in the image background while preserving pixel-level detection details. In addition, patch-wise sparse regularization is introduced using a counting distribution to obtain clear-cut predictions while maintaining the true-positive rate. The experimental results demonstrate that parsing the global context helps the proposed network decrease false-positive detections significantly. The proposed counting regularization also improves true-positive detection while concurrently decreasing false negatives.
It enables the proposed network to precisely detect sparse checkerboard corners, leading to significant improvements over the state-of-the-art methods.
Title: Sparse Checkerboard Corner Detection from Global Perspective.
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576775
Nurul Amirah Mashudi, N. Ahmad, N. Noor
Biometric applications have attracted tremendous attention these days due to technological advancements and the high demand for safety and security systems. Among existing biometric traits such as fingerprints, palm, face, retina, voice, and gait, the iris is known as the most consistent and precise. Iris segmentation is the most significant and essential stage in the iris-recognition process, and the segmentation method directly affects the accuracy of iris recognition. In this study, we propose a Dynamic U-Net using ResNet-34 to improve the segmentation results based on the F1 score. The proposed method produces better accuracy when post-processing is applied. Based on a comparative analysis with other methods in the literature, our proposed method produces a higher F1 score: the segmentation results were compared with the Unified IrisParseNet, and our proposed method achieved 93.66% accuracy, higher than the Unified IrisParseNet at 93.05%.
The computational time is still high, however, and can be further improved in future work.
Title: Dynamic U-Net Using Residual Network for Iris Segmentation.
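For binary segmentation masks, the F1 score used for evaluation above is equivalent to the Dice coefficient and can be computed directly from pixel-wise true positives, false positives, and false negatives:

```python
import numpy as np

def f1_score_masks(pred, target):
    """F1 (Dice) between two binary segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.logical_and(pred, target).sum()     # iris pixels found
    fp = np.logical_and(pred, ~target).sum()    # background marked as iris
    fn = np.logical_and(~pred, target).sum()    # iris pixels missed
    return 2 * tp / (2 * tp + fp + fn)

target = np.zeros((10, 10), dtype=bool)
target[2:8, 2:8] = True            # ground-truth iris region (36 px)
pred = np.zeros_like(target)
pred[3:8, 2:8] = True              # prediction misses one row (30 px)
print(round(f1_score_masks(pred, target), 4))  # -> 0.9091
```

F1 is preferred over plain pixel accuracy for iris masks because the iris occupies a small fraction of the image, so accuracy alone would reward trivially predicting background.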
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576768
L. Chong, S. Chong
A flexible feature descriptor, the Gabor-based Region Covariance Matrix (GRCM), which embeds Gabor features into a covariance matrix, has emerged in face recognition. The GRCM lies on the tensor manifold, a non-Euclidean space, and utilises distance measures such as the Affine-Invariant Riemannian Metric (AIRM) and the Log-Euclidean Riemannian Metric (LERM) to calculate the distance between two covariance matrices. However, these distance measures are computationally expensive. Therefore, a machine-learning approach via manifold flattening is proposed to alleviate the problem. Besides, several feature fusions that integrate 2.5D partial data and 2D texture images are investigated to boost the recognition rate. Experimental results exhibit the effectiveness of the proposed method in improving the recognition rate for 2.5D face recognition.
Title: Fusion of 2.5D Face Recognition through Extreme Learning Machine via Manifold Flattening.
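Of the two metrics named above, the LERM is the cheaper one: it takes one matrix logarithm per covariance matrix, after which distances reduce to ordinary Euclidean (Frobenius) distances — which is also the intuition behind flattening the manifold. A minimal sketch using SciPy:

```python
import numpy as np
from scipy.linalg import logm

def lerm_distance(A, B):
    """Log-Euclidean Riemannian distance between two SPD matrices:
    the Frobenius norm of the difference of their matrix logarithms."""
    return np.linalg.norm(logm(A) - logm(B), ord="fro")

def random_spd(n, rng):
    """A random symmetric positive-definite matrix, for illustration."""
    M = rng.normal(size=(n, n))
    return M @ M.T + n * np.eye(n)

rng = np.random.default_rng(0)
A, B = random_spd(4, rng), random_spd(4, rng)
print(lerm_distance(A, B))
```

Once every covariance matrix is mapped to its logarithm, the descriptors live in a flat vector space where standard learners (such as the extreme learning machine used in the paper) apply directly, avoiding repeated AIRM geodesic computations.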