
Latest publications from the 2021 IEEE International Conference on Signal and Image Processing Applications (ICSIPA)

Electroencephalogram Stress Classification of Single Electrode using K-means Clustering and Support Vector Machine
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576794
Tee Yi Wen, S. A. M. Aris, S. A. Jalil, S. Usman
Stress is the body's natural reaction to life events, and chronic stress disrupts the physiological equilibrium of the body, which ultimately contributes to a negative impact on physical and mental health. Hence, developing a stress-level monitoring system is necessary and important for clinical intervention and disease prevention. In this study, an electroencephalography (EEG) acquisition tool was used to capture brainwave signals at the prefrontal cortex (Fp1 and Fp2) from 50 participants and to investigate brain states related to stress induced by a virtual reality (VR) horror video and an intelligence quotient (IQ) test. The collected EEG signals were pre-processed to remove artifacts, and stress-related EEG features were extracted through frequency-domain analysis as the power spectral density (PSD) values of the Theta, Alpha and Beta frequency bands. The Wilcoxon signed-rank test was carried out to identify significant differences in absolute power between the resting baseline and the post-stimuli conditions. The test showed that, for single-electrode EEG features, Theta absolute power increased significantly at the Fp1 electrode (p<0.001) and the Fp2 electrode (p<0.015) post-IQ, whereas Beta absolute power at the Fp2 electrode increased significantly under both conditions, post-VR (p<0.024) and post-IQ (p<0.011). Following this, the significant features were clustered into three stress-level groups using the k-means clustering method, and the labelled data were fed into a support vector machine (SVM) to classify the stress levels. 10-fold cross-validation was applied to evaluate the classifier's performance; the result confirmed the highest performance of 98% accuracy in distinguishing the three stress states using only the Beta-band absolute power feature from a single electrode (Fp2).
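As a rough illustration of the pipeline described in this abstract, the sketch below clusters a single band-power feature into three stress levels with k-means and then classifies those labels with an SVM under 10-fold cross-validation. The data are synthetic stand-ins for per-trial Beta absolute power at Fp2; the PSD extraction and the actual EEG recordings are not reproduced.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in for per-trial Beta absolute power at a single electrode (Fp2).
beta_power = np.concatenate([
    rng.normal(5.0, 0.8, 50),    # low-stress-like trials
    rng.normal(9.0, 0.8, 50),    # medium
    rng.normal(14.0, 0.8, 50),   # high
]).reshape(-1, 1)

# Step 1: cluster the significant feature into three stress-level groups (unsupervised).
stress_level = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(beta_power)

# Step 2: train an SVM on the k-means labels and evaluate with 10-fold cross-validation.
scores = cross_val_score(SVC(kernel="rbf"), beta_power, stress_level, cv=10)
print(f"mean 10-fold accuracy: {scores.mean():.3f}")
```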
Citations: 1
Spoken Malay Profanity Classification Using Convolutional Neural Network
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576781
A. Wazir, H. A. Karim, Nouar Aldahoul, M. F. A. Fauzi, Sarina Mansor, Mohd Haris Lye Abdullah, Hor Sui Lyn, Tabibah Zainab Zulkifli
Foul language exists in films, video-sharing platforms, and social media platforms, which increases the risk that viewers are exposed to a large number of profane words with negative personal and social impact. This work proposes CNN-based recognition of spoken Malay foul words to establish a basis for spoken foul-term detection for monitoring and censorship purposes. A novel foul-speech dataset containing 1512 samples was collected, processed, and annotated. The dataset was then converted into a spectral representation of Mel-spectrogram images to be used as input to the CNN model. This research proposes a lightweight CNN model with only six convolutional layers and small filters to minimize the computational cost. The proposed model's performance affirms the viability of the proposed visual-based classification method using a CNN, achieving an average Malay foul-speech-term classification accuracy of 86.50%, precision of 88.68%, and F-score of 86.83%. The normal conversational class outperformed the foul-word class owing to data imbalance and the rarity of foul-speech samples compared with normal speech terms.
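A minimal sketch of the described front end and model family, assuming clips are padded or cropped to a fixed Mel-spectrogram size (64 x 128 here) and assuming illustrative filter counts; the paper's exact architecture, hyperparameters, and dataset are not reproduced.

```python
import librosa
import numpy as np
import tensorflow as tf

def to_mel_image(wav_path, sr=16000, n_mels=64):
    """Convert an audio clip to a log-Mel-spectrogram 'image' (clips are assumed to be
    padded/cropped elsewhere to a fixed length so every image has the same width)."""
    y, _ = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)[..., np.newaxis]   # (n_mels, time, 1)

def build_light_cnn(input_shape=(64, 128, 1), n_classes=2):
    """Lightweight CNN: six convolutional layers with small (3x3) filters."""
    layers = tf.keras.layers
    model = tf.keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(16, 3, activation="relu", padding="same"),
        layers.Conv2D(16, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```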
Citations: 1
Unified Discriminant and Distribution Alignment for Visual Domain Adaptation
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576812
M. Samsudin, S. Abu-Bakar, M. Mokji
In visual understanding, images taken from different cameras usually have different resolutions, illumination, poses, and background views, which lead to domain shift. Besides, labeling these data is an expensive operation. These problems motivate unsupervised domain adaptation (UDA), in which training and testing data are not drawn from the same distribution and labels are not available in the target domain. This paper presents an improvement to unsupervised domain adaptation in transfer learning using a unified discriminant and distribution alignment (UDDA). The existing UDA method only utilized unsupervised PCA for dimensionality reduction before it was added to the joint objective function consisting of distribution-discrepancy minimization and regularization. However, some works (e.g., joint geometrical and statistical alignment, JGSA) have exploited the source-domain labels through the supervised method LDA and shown good improvement. Nevertheless, LDA has drawbacks: it is sensitive to noise and outliers because of its squared operations and captures only global information. The contribution of this paper is to add local discriminant information to the proposed UDA model by adopting locality-sensitive discriminant analysis (LSDA), which strengthens the between-class and within-class discriminants in the source domain. In this method, within-class and between-class graphs are computed and summed with the within-class and between-class scatter matrices before being embedded into the joint domain adaptation framework. Compared with state-of-the-art techniques on object and digit datasets, our UDDA improved the average accuracy by 3.43% and 7.13% over the second- and third-highest results, respectively.
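To make the local-discriminant idea concrete, the sketch below computes locality-sensitive within-class and between-class scatter matrices from k-NN graphs over labelled source-domain data. This is only the LSDA building block under assumed settings (k = 5, binary graph weights); the paper's full joint objective with distribution alignment and regularization is not reproduced.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def lsda_scatter(X, y, k=5):
    """Locality-sensitive within-class / between-class scatter matrices.

    X: (n_samples, n_features) labelled source-domain data, y: class labels.
    Same-class k-NN edges feed the within-class graph, different-class edges
    the between-class graph; each graph Laplacian L yields a scatter X^T L X.
    """
    n = X.shape[0]
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)

    W_within = np.zeros((n, n))
    W_between = np.zeros((n, n))
    for i in range(n):
        for j in idx[i, 1:]:                     # skip the point itself
            if y[i] == y[j]:
                W_within[i, j] = W_within[j, i] = 1.0
            else:
                W_between[i, j] = W_between[j, i] = 1.0

    def graph_scatter(W):
        L = np.diag(W.sum(axis=1)) - W           # graph Laplacian
        return X.T @ L @ X

    return graph_scatter(W_within), graph_scatter(W_between)
```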
Citations: 0
IRPMID: Medical XRAY Image Impulse Noise Removal using Partition Aided Median, Interpolation and DWT
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576773
M. Shajahan, S. A. M. Aris, S. Usman, N. Noor
The use of dental radiography techniques such as X-ray has increased diagnostic efficiency, but the images contain impulse noise. Therefore, denoising is a very important factor in any subjective evaluation of image quality. The proposed impulse-noise removal method, 'Impulse noise removal using Partition aided Median, Interpolation and DWT techniques' (abbreviated IRPMID), is successfully applied to dental X-ray images. Techniques such as the median filter, interpolation, and the Discrete Wavelet Transform (DWT), combined with a partition-based approach, enable the proposed method to denoise salt-and-pepper noise. This method can reduce the noise by more than 90%, and the dental X-ray images gain a significant improvement in enhancement in terms of PSNR, IEF, SSIM, and related metrics.
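The sketch below illustrates the individual building blocks named in the abstract (impulse-pixel detection, median-based replacement, and DWT soft-thresholding) on an 8-bit image; it is not the paper's partition-aided scheme, and the wavelet, threshold rule, and noise model are assumptions.

```python
import numpy as np
import pywt
from scipy.ndimage import median_filter

def denoise_xray(img):
    """Salt-and-pepper cleanup sketch for an 8-bit image: replace extreme-valued
    pixels with the local median, then soft-threshold the DWT detail sub-bands."""
    img = img.astype(np.float64)

    # 1) Flag likely impulse-noise pixels (pure black or pure white).
    noisy = (img == 0) | (img == 255)

    # 2) Replace only the flagged pixels with the 3x3 neighbourhood median.
    restored = np.where(noisy, median_filter(img, size=3), img)

    # 3) One-level DWT and soft-thresholding of the detail coefficients.
    cA, (cH, cV, cD) = pywt.dwt2(restored, "haar")
    sigma = np.median(np.abs(cD)) / 0.6745                    # robust noise estimate
    thr = sigma * np.sqrt(2 * np.log(restored.size))          # universal threshold
    cH, cV, cD = (pywt.threshold(c, thr, mode="soft") for c in (cH, cV, cD))
    return pywt.idwt2((cA, (cH, cV, cD)), "haar")
```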
Citations: 1
A Critical Review on Water Level Measurement Techniques for Flood Mitigation
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576793
Obaid Rafiq Jan, H. S. Jo, Riady Siswoyo Jo
Flood-related disasters have caused millions of displaced people and casualties worldwide. Hence it is necessary to develop monitoring systems capable of measuring water levels and depths by means of sensors or camera surveillance. Contact measurement sensors have the disadvantages of regular maintenance and high operational costs, whereas camera surveillance combined with computer vision and image processing offers advantages over such measurement systems. In this study, traditional image processing and more advanced, current state-of-the-art computer vision techniques are compared and analyzed according to their implementation.
Citations: 1
Road Traffic Sign Detection and Recognition using Adaptive Color Segmentation and Deep Learning
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576798
Roozbeh KhabiriKhatiri, I. A. Latiff, Ahmad Sabry Mohamad
Traffic sign detection and recognition (TSDR) is one of the main areas of research in autonomous vehicles and Advanced Driving Assistance Systems (ADAS). In this paper, a method is proposed to detect and classify the prohibitory subset of the German Traffic Sign dataset. The traffic sign detection module utilizes adaptive color segmentation based on the mean saturation value of the local neighborhood, together with the Circular Hough Transform (CHT), to locate traffic signs in the input images. Adaptive color thresholding improves the segmentation of traffic signs in images with uneven lighting or very high or low contrast, compared with global thresholding. Furthermore, the number of false alarms is minimized by an additional validation stage. For the recognition phase of the algorithm, multiple deep convolutional neural networks (CNNs) with different structures are developed from scratch to compare their performance and identify the network with the highest accuracy.
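A simplified sketch of the detection stage: a red-hue mask gated by a locally adaptive saturation threshold (each pixel compared against the mean saturation of its neighbourhood), followed by a Circular Hough Transform. The neighbourhood size, hue ranges, and Hough parameters below are illustrative assumptions, not the paper's values.

```python
import cv2
import numpy as np

def detect_circular_signs(bgr):
    """Red-hue mask gated by a locally adaptive saturation threshold, then CHT."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    h, s, _ = cv2.split(hsv)

    # A pixel is "saturated enough" if it exceeds the mean saturation of its
    # neighbourhood, which copes with uneven lighting better than a global value.
    local_mean_s = cv2.blur(s, (31, 31)).astype(np.int32)
    sat_mask = s.astype(np.int32) > local_mean_s + 10

    red_mask = (h < 10) | (h > 170)              # red hue wraps around 0/180 in OpenCV
    mask = ((red_mask & sat_mask) * 255).astype(np.uint8)
    mask = cv2.medianBlur(mask, 5)

    # Circular Hough Transform on the binary mask to localise round sign candidates.
    circles = cv2.HoughCircles(mask, cv2.HOUGH_GRADIENT, dp=1, minDist=30,
                               param1=100, param2=20, minRadius=10, maxRadius=80)
    return [] if circles is None else np.round(circles[0]).astype(int)   # rows of (x, y, r)
```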
Citations: 1
Region Detection Rate: An Applied Measure for Surface Defect Localization
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576810
T. Huxohl, F. Kummert
The automatic localization and classification of surface defects helps to ensure the quality of industrially manufactured products. For the development of such automatic systems, a measure is needed that allows thorough optimization and comparison. However, there is currently no measure dedicated specifically to surface defect localization, and measures from related fields are unsuitable. Thus, we present the Region Detection Rate (RDR), which is specialized for defect localization since it is evaluated in a defect-wise manner. It entails a set of rules that define the circumstances under which a defect is considered detected and a prediction is considered a false positive. The usability of the RDR is qualitatively demonstrated on examples from three different datasets, one of which has been annotated as part of this work. We hope that the new measure supports the development of future automatic surface-defect localization systems and raises a discussion about the suitability of measures for this task.
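The abstract does not spell out the matching rules, so the sketch below only illustrates the defect-wise character of such a measure: a ground-truth defect counts as detected if at least one predicted box overlaps it above an assumed IoU threshold, and the rate is the fraction of defects detected. The rule and threshold are assumptions, not the authors' definition.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def region_detection_rate(defects, predictions, thr=0.3):
    """Fraction of ground-truth defect regions matched by at least one prediction.
    The IoU-based matching rule and threshold are assumptions for illustration."""
    if not defects:
        return 1.0
    detected = sum(any(iou(d, p) >= thr for p in predictions) for d in defects)
    return detected / len(defects)
```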
Citations: 1
Sparse Checkerboard Corner Detection from Global Perspective
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576808
Jiwoo Kang, H. Yoon, Seongmin Lee, Sanghoon Lee
Detecting corners in an image is an essential step for camera calibration in geometric computer vision and image processing applications. In this paper, a novel framework is proposed to detect sparse checkerboard corners from an image using global context. The proposed framework addresses two major problems of previous neural-network-based corner detection networks: locality and non-sparsity. Our framework encodes the global context of an image and uses it to determine the per-patch existence of the checkerboard. This enables the network to distinguish between the checkerboard pattern and pattern-like noise in the image background while preserving pixel-level detection details. In addition, patch-wise sparse regularization is introduced using a counting distribution to obtain clear-cut predictions while maintaining the true-positive rate. The experimental results demonstrate that parsing the global context helps the proposed network to decrease false-positive detections significantly. The proposed counting regularization also improves true-positive detection while concurrently decreasing false negatives. This enables the proposed network to precisely detect sparse checkerboard corners, leading to significant improvements over state-of-the-art methods.
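The counting-distribution regularization is specific to the paper and not detailed in the abstract, so the toy sketch below only conveys the general idea of predicting a per-patch existence map from a downsampled feature map and penalizing non-sparse predictions. The architecture, loss form, and weights are assumptions rather than the authors' design.

```python
import torch
import torch.nn as nn

class PatchExistenceHead(nn.Module):
    """Toy per-patch checkerboard-existence predictor over a downsampled feature map."""
    def __init__(self, in_ch=3, feat=32):
        super().__init__()
        self.backbone = nn.Sequential(            # four stride-2 convs: one cell per 16x16 patch
            nn.Conv2d(in_ch, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(feat, 1, 1)         # one existence logit per patch

    def forward(self, x):
        return self.head(self.backbone(x))        # (B, 1, H/16, W/16)

def patch_loss(logits, target, sparsity_weight=0.01):
    """Per-patch BCE plus a simple sparsity penalty on the predicted existence map."""
    bce = nn.functional.binary_cross_entropy_with_logits(logits, target)
    return bce + sparsity_weight * torch.sigmoid(logits).mean()
```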
Citations: 1
Dynamic U-Net Using Residual Network for Iris Segmentation
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576775
Nurul Amirah Mashudi, N. Ahmad, N. Noor
Biometric applications have attracted tremendous attention in recent years due to technological advancements and the high demand for safety and security systems. Among existing biometric traits such as fingerprints, palm, face, retina, voice, and gait, the iris is known as the most consistent and precise. Iris segmentation is the most significant and essential stage in the iris recognition process, and the segmentation method directly affects the accuracy of iris recognition. In this study, we propose a Dynamic U-Net using ResNet-34 to improve the segmentation results based on the F1 score. The proposed method produces better accuracy when post-processing is applied, and a comparative analysis with other methods in the literature shows that it achieves a higher F1 score. The segmentation results were compared with the Unified IrisParseNet: our proposed method achieved 93.66% accuracy, higher than Unified IrisParseNet at 93.05%. The computational time is, however, high and can be further improved in future work.
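One way to assemble a ResNet-34-backed U-Net for binary iris-mask prediction is sketched below with the segmentation_models_pytorch package; the package choice, loss, and optimizer settings are assumptions and do not reproduce the authors' Dynamic U-Net training setup or post-processing.

```python
import torch
import segmentation_models_pytorch as smp

# U-Net with a ResNet-34 encoder; grayscale iris images in, a binary iris mask out.
model = smp.Unet(encoder_name="resnet34", encoder_weights="imagenet",
                 in_channels=1, classes=1)
criterion = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images, masks):
    """One step: images (B, 1, H, W) float tensors, masks (B, 1, H, W) in {0, 1}."""
    optimizer.zero_grad()
    loss = criterion(model(images), masks)
    loss.backward()
    optimizer.step()
    return loss.item()
```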
Citations: 1
Fusion of 2.5D Face Recognition through Extreme Learning Machine via Manifold Flattening
Pub Date: 2021-09-13 | DOI: 10.1109/ICSIPA52582.2021.9576768
L. Chong, S. Chong
A flexible feature descriptor, the Gabor-based Region Covariance Matrix (GRCM), which embeds Gabor features into a covariance matrix, has emerged in face recognition. The GRCM lies on a tensor manifold, a non-Euclidean space, and utilises distance measures such as the Affine-Invariant Riemannian Metric (AIRM) and the Log-Euclidean Riemannian Metric (LERM) to calculate the distance between two covariance matrices. However, these distance measures are computationally expensive. Therefore, a machine learning approach via manifold flattening is proposed to alleviate this problem. Besides, several feature fusions that integrate the 2.5D partial data and the 2D texture image are investigated to boost the recognition rate. Experimental results demonstrate the effectiveness of the proposed method in improving the recognition rate for 2.5D face recognition.
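A small sketch of the descriptor and the Log-Euclidean Riemannian Metric mentioned above: the region covariance of per-pixel feature vectors (e.g., stacked Gabor responses) and the LERM distance between two such SPD matrices. Gabor filtering, manifold flattening, and the extreme learning machine classifier are omitted, and the regularization constant is an assumption.

```python
import numpy as np

def region_covariance(features):
    """Region covariance descriptor: features is (n_pixels, d), e.g. per-pixel
    Gabor responses; returns a d x d SPD matrix (lightly regularised)."""
    C = np.cov(features, rowvar=False)
    return C + 1e-6 * np.eye(C.shape[0])

def spd_log(M):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.log(w)) @ V.T

def lerm_distance(A, B):
    """Log-Euclidean Riemannian Metric: d(A, B) = || log(A) - log(B) ||_F."""
    return np.linalg.norm(spd_log(A) - spd_log(B), "fro")
```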
Citations: 0