
2021 IEEE International Conference on Signal and Image Processing Applications (ICSIPA): Latest Papers

Design and Optimization of Homomorphic Medical Image Fusion Algorithm
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576802
M. K. Awang, Muhd Aliff Haiqal Mohd Marzuki, Nurul Kamilah Mat Kamil
Multimodal image fusion combines images of different modalities into a single image without losing the overall meaning of the input images. The homomorphic method enhances digital images by boosting the high-frequency image components and attenuating the low-frequency components associated with unwanted illumination. This paper demonstrates how the homomorphic fusion method can improve fused image quality compared with basic fusion methods such as principal component analysis (PCA) and the discrete wavelet transform (DWT). The design and simulation are carried out in MATLAB on selected medical modalities, MR-PET and MRI. The results are compared with those of the aforementioned methods using Mutual Information (MI), and they show that the homomorphic method is more efficient than the DWT and PCA methods.
Citations: 0
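The abstract above names Mutual Information (MI) as its comparison metric but does not say how it was computed. The following is a minimal sketch, assuming a histogram-based estimate over integer pixel intensities; the function names `mutual_information` and `fusion_mi` are illustrative and do not come from the paper, and the paper's own experiments were run in MATLAB rather than Python.

```python
import math
from collections import Counter


def mutual_information(image_a, image_b):
    """Mutual information (in bits) between two equally sized grayscale images.

    Each image is a list of rows of integer intensities. The joint
    distribution is estimated from co-occurrence counts of pixel pairs.
    """
    pixels_a = [v for row in image_a for v in row]
    pixels_b = [v for row in image_b for v in row]
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images must have the same number of pixels")

    n = len(pixels_a)
    joint = Counter(zip(pixels_a, pixels_b))  # joint histogram
    marg_a = Counter(pixels_a)                # marginal histograms
    marg_b = Counter(pixels_b)

    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        p_a = marg_a[a] / n
        p_b = marg_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def fusion_mi(source_1, source_2, fused):
    """Assumed fusion score: MI(source_1, fused) + MI(source_2, fused)."""
    return mutual_information(source_1, fused) + mutual_information(source_2, fused)
```

Under this definition a higher score means the fused image shares more information with its two sources. Summing the two MI terms is one common convention for fusion quality, assumed here rather than taken from the paper.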
A Fast and Unbiased Minimalistic Resampling Approach for the Particle Filter
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576807
R. Gurajala, P. Choppala, J. Meka, Paul D. Teal
The particle filter is an important approximation method for online state estimation in nonlinear, non-Gaussian scenarios. The resampling step in the particle filter is critical because it eliminates the wasteful use of particles that do not contribute to the posterior (degeneracy). Fully stochastic resamplers, despite being unbiased in approximating the posterior density, involve exhaustive and sequential communication among the particles and are therefore computationally expensive. The alternative partial deterministic resamplers overcome this problem by reducing the communication among particles, but this leads to approximation bias. This paper proposes a fast resampling procedure that gives an accurate approximation of the posterior and tracks as accurately as the conventional resamplers.
Citations: 0
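To make the resampling step discussed in the abstract above concrete, here is a minimal sketch of two textbook resamplers: multinomial resampling (fully stochastic, one random draw per particle) and systematic resampling (a single random draw followed by evenly spaced points). Neither is the procedure proposed in the paper, whose details the abstract does not give, and the sketch makes no claim about how the abstract's taxonomy would classify them. The function names are illustrative.

```python
import random


def multinomial_resample(weights, rng):
    """Fully stochastic resampling: draw N indices independently by weight."""
    n = len(weights)
    return rng.choices(range(n), weights=weights, k=n)


def systematic_resample(weights, rng):
    """Systematic resampling: one random offset, then N evenly spaced points."""
    n = len(weights)
    total = sum(weights)
    offset = rng.random() / n  # single uniform draw in [0, 1/n)
    indices = []
    cumulative = weights[0] / total
    i = 0
    for k in range(n):
        point = offset + k / n
        # Advance to the particle whose cumulative weight covers this point.
        while point > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i] / total
        indices.append(i)
    return indices
```

Both functions return the indices of the particles that survive. Particles with zero weight are never selected, which is how resampling counters degeneracy.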
Comparison of Dental Caries Level Images Classification Performance using KNN and SVM Methods
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576774
Y. Jusman, M. K. Anam, Sartika Puspita, Edwyn Saleh, S. N. A. Kanafiah, Rhesezia Intan Tamarena
This study aims to build a dental caries level classification system based on image processing (i.e., texture feature extraction) and machine learning methods. The first step was to analyze the features extracted by the Gray Level Co-Occurrence Matrix (GLCM) algorithm. After the features were extracted, classification was carried out using a Support Vector Machine (SVM) and K-Nearest Neighbors (KNN). Both machine learning methods were analyzed to determine which gives the better classification results. The study used radiographic images of four dental caries classes: Class 1, 2, 3, and 4. A total of 396 images were used after pre-processing; 90% of them served as training data and the rest as testing data. Accuracy was measured for both the SVM and KNN classifiers. For SVM, the highest accuracy was 95.7%, obtained with the Fine Gaussian SVM model, and the lowest was 83.3%, obtained with the Quadratic SVM model. For KNN, the highest accuracy was 94.9%, obtained with the Fine KNN model, and the lowest was 91.4%, obtained with the Weighted KNN model. The KNN classification results are better than the SVM results.
Citations: 9
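The texture features mentioned in the abstract above come from a Gray Level Co-Occurrence Matrix. The following minimal sketch shows how such a matrix and two commonly used features can be computed for one pixel offset. The offset, the choice of features, and the homogeneity convention (dividing by 1 + |i - j|) are assumptions, since the abstract does not specify them.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for the offset (dx, dy).

    `image` is a list of rows of integers in range(levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Weighted sum of squared gray-level differences."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


def homogeneity(p):
    """Closeness of the matrix mass to the diagonal (assumed convention)."""
    size = len(p)
    return sum(p[i][j] / (1 + abs(i - j)) for i in range(size) for j in range(size))
```

Feature values like these, computed per image, would form the input vectors passed to a classifier such as SVM or KNN.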
Classification of Digital Chess Pieces and Board Position using SIFT
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576797
Brandon Sean Kong, I. Hipiny, Hamimah Ujir
Assistive technology has received growing attention in recent years as a way to help people with disabilities perform common tasks. Rather than designing a specialised tool for the task, it is more cost-effective and less inhibitory to make use of existing hardware integrated with a smart interface. Towards this end goal, we present our work on assisting a visually impaired person in playing an online chess game. We evaluated an invariant feature descriptor, SIFT, for the task of classifying individual chess pieces across multiple visual themes. We compared two strategies for building the visual codebook: k-means clustering and image blending. The proposed pipeline receives live screen feeds from the browser at a fixed interval and outputs each chess piece’s label and board position. Paired with a visual codebook built using k-means clustering, the pipeline achieved an average accuracy rate of 6/10.
Citations: 1
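The k-means visual codebook mentioned in the abstract above can be illustrated with a small sketch: cluster local descriptors into codewords, then describe an image patch by the histogram of its nearest codewords. This is an assumed, generic bag-of-visual-words setup rather than the paper's pipeline. Real SIFT descriptors are 128-dimensional and need an image library to compute, so toy 2-D vectors stand in for them here, and all function names are illustrative.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, descriptor):
    """Index of the codeword closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(codebook[i], descriptor))


def kmeans_codebook(descriptors, k, iterations=10):
    """Lloyd's k-means with a deterministic start (the first k descriptors)."""
    codebook = [list(d) for d in descriptors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest(codebook, d)].append(d)
        for i, members in enumerate(clusters):
            if members:  # keep the old codeword if a cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def bovw_histogram(codebook, descriptors):
    """Normalized histogram of codeword assignments for one image patch."""
    if not descriptors:
        raise ValueError("need at least one descriptor")
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest(codebook, d)] += 1
    return [h / len(descriptors) for h in hist]
```

The resulting histogram is a fixed-length vector regardless of how many descriptors a patch yields, which is what lets a standard classifier assign a piece label.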
Inclusion of Climate Variables for Dengue Prediction Model: Preliminary Analysis
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576776
Loshini Thiruchelvam, S. Dass, Nirbhay Mathur, V. Asirvadam, B. Gill
This study aimed to build the best dengue case prediction model for the Petaling district in Selangor. The linear least squares estimation method is used to build the models, and the Mean Square Error (MSE) and Akaike Information Criterion (AIC) values are used to compare them. Prior to model development, the variables are first normalized using a 0–1 normalization procedure. Next, significant predictors are identified among the weather variables, namely mean temperature, relative humidity, and rainfall. Third, feedback data were included to determine whether they yield better prediction models. Models with a few lag-time orders were built simultaneously, and the most accurate prediction model was selected for the Petaling district. The study found that dengue prediction models including all three climate variables (mean temperature, relative humidity, and cumulative rainfall) together with previous dengue cases had the lowest MSE and AIC values. This is in line with previous studies that selected models combining climate variables and previous dengue cases as the best fit. The study therefore proposes that future studies incorporate all three climate variables and previous dengue cases when developing dengue prediction models.
Citations: 1
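The 0–1 normalization and the two comparison criteria named in the abstract above can be sketched as follows. The AIC form used here, n·ln(MSE) + 2k, is the usual expression for a least-squares fit with Gaussian errors up to an additive constant. Whether the paper used exactly this form is an assumption, and the function names are illustrative.

```python
import math


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant series")
    return [(v - lo) / (hi - lo) for v in values]


def mean_squared_error(observed, predicted):
    """Average squared difference between observations and predictions."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def aic_least_squares(observed, predicted, num_params):
    """AIC for a least-squares fit, up to an additive constant.

    AIC = n * ln(MSE) + 2 * k, where k is the number of fitted parameters.
    Undefined when the fit is perfect (MSE of zero).
    """
    n = len(observed)
    return n * math.log(mean_squared_error(observed, predicted)) + 2 * num_params
```

When candidate models are compared on the same data, the one with the lower AIC is preferred. The 2k term penalizes models that add predictors or lag terms without reducing the error enough.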
Comparative Analysis of Explainable Artificial Intelligence for COVID-19 Diagnosis on CXR Image
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576766
Joe Huei Ong, Kam Meng Goh, Li Li Lim
The COVID-19 outbreak has had a huge impact globally. Early studies show that COVID-19 manifests in the chest X-rays of infected patients. These studies have drawn the attention of the computer vision community to combining X-ray scans with deep-learning-based solutions to aid the diagnosis of COVID-19 infection. At present, however, efforts and information on applying explainable artificial intelligence to interpret deep learning models for COVID-19 recognition are scarce. In this paper, we propose and compare the use of LIME and SHAP to enhance the interpretation of COVID-19 diagnosis from X-ray scans. We first applied SqueezeNet to recognise pneumonia, COVID-19, and normal lung images, obtaining a testing accuracy of 84.34%. To better understand what the network “sees” in a specific task, namely image classification, Shapley Additive Explanation (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) were implemented to explain how SqueezeNet performs classification. Results show that LIME and SHAP can highlight the areas of interest, helping to increase the transparency and interpretability of the SqueezeNet model.
Citations: 6
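SHAP, named in the abstract above, builds on Shapley values from cooperative game theory. As a minimal sketch of that underlying idea, the code below computes exact Shapley values for a tiny hypothetical model by averaging each feature's marginal contribution over all feature orderings. It does not use the SHAP library, which relies on approximations to scale to image models, and `toy_model` is invented purely for illustration.

```python
from itertools import permutations


def shapley_values(value, num_features):
    """Exact Shapley values, averaged over all feature orderings.

    `value` maps a frozenset of feature indices to the model output when only
    those features are present. The cost grows factorially with the number of
    features, so this is suitable for tiny examples only.
    """
    totals = [0.0] * num_features
    orderings = list(permutations(range(num_features)))
    for order in orderings:
        present = frozenset()
        for feature in order:
            with_feature = present | {feature}
            totals[feature] += value(with_feature) - value(present)
            present = with_feature
    return [t / len(orderings) for t in totals]


def toy_model(present):
    """Hypothetical score: feature 0 adds 2, feature 1 adds 1, and
    features 1 and 2 together add an interaction bonus of 4."""
    score = 0.0
    if 0 in present:
        score += 2.0
    if 1 in present:
        score += 1.0
    if 1 in present and 2 in present:
        score += 4.0
    return score
```

In this toy case the interaction bonus is split equally between features 1 and 2, and the attributions sum to the difference between the full model output and the empty baseline. That additivity is the property SHAP explanations rely on.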
Automatic Diagnosis and Prediction of Cognitive Decline Associated with Alzheimer’s Dementia through Spontaneous Speech
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576784
Ziming Liu, Lauren Proctor, Parker N. Collier, Xiaopeng Zhao
With the increasing prevalence of Alzheimer’s disease (AD), it is important to develop detectable biomarkers that reliably identify AD at an early stage. Language deficit is one of the common signs that appear in the early stage of mild Alzheimer’s disease. Therefore, the use of natural language processing and related machine learning algorithms for AD diagnosis from patients’ speech recordings has drawn increasing attention in recent years. In this study, three approaches are proposed to extract features from speech recordings: (1) fine-tuning a pre-trained encoder model (BERT) on transcripts from automatic transcription, (2) hand-crafted linguistic features computed on transcripts from automatic transcription, and (3) selected acoustic features from denoised speech recordings. The three approaches are applied to three tasks: AD diagnosis, MMSE score prediction, and cognitive decline inference. The approach using BERT yields the best performance in all three challenge tasks based on cross-validation results on the training dataset. Specifically, in the AD diagnosis task, 5-fold cross-validation using encoded features based on transcripts generated by Deep Speech yields an average classification accuracy of 97.18%. In the MMSE score prediction task, 5-fold cross-validation using BERT-encoded features based on transcripts generated by Deep Speech yields an average Root Mean Squared Error (RMSE) of 3.76. In the cognitive decline inference task, leave-one-out cross-validation using BERT-encoded features based on transcripts generated by Sphinx or Deep Speech yields an average classification accuracy of 100%. The analyses suggest that the combination of automatic transcription and BERT may perform strongly on AD-related detection and prediction problems.
Citations: 3
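The evaluation protocol in the abstract above rests on k-fold and leave-one-out cross-validation and on RMSE. The following minimal sketch shows how the folds and the error measure can be formed, assuming contiguous, unshuffled folds for determinism. The abstract does not say how the folds were drawn, and the function names are illustrative.

```python
import math


def k_fold_indices(num_samples, k):
    """Split range(num_samples) into k contiguous folds of near-equal size.

    Returns a list of (train_indices, test_indices) pairs. Real experiments
    usually shuffle first; that step is omitted to keep the sketch
    deterministic. Setting k equal to num_samples gives leave-one-out.
    """
    fold_sizes = [num_samples // k + (1 if i < num_samples % k else 0)
                  for i in range(k)]
    splits, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(num_samples) if i < start or i >= start + size]
        splits.append((train, test))
        start += size
    return splits


def rmse(observed, predicted):
    """Root mean squared error between observations and predictions."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )
```

Each sample appears in exactly one test fold, so averaging the per-fold accuracy or RMSE gives the kind of figure the abstract reports.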
Emotion Recognition Using Bahasa Malaysia Natural Speech
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576788
Afiq Aiman Ahmad Fairuz Rizal, N. Hashim
In recent years, speech emotion recognition technology has become increasingly important to industry. This is shown by its integration into many applications, such as robot interfaces, audio surveillance, web-based e-learning, commercial applications, and clinical studies. Generally, speech emotion recognition (SER) is developed to help humans understand and retrieve desired emotions. In this research, Bahasa Malaysia speech was analyzed for three basic emotions: happy, sad, and angry. A total of 30 male and 30 female audio recordings were collected. Mel-frequency cepstral coefficient, chroma, and mel spectrogram features were extracted. Feature dimensions were reduced using forward, backward, and exhaustive selection methods before classification. Classification was performed using K-Nearest Neighbors, Support Vector Machine, and Random Forest. The analysis demonstrated 78% accuracy for male speech and 78% for female speech.
Citations: 0
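Of the feature selection methods listed in the abstract above, forward selection is the simplest to sketch: start with no features and greedily add whichever one improves a score the most. The scoring function below is a hypothetical stand-in; in practice it would be something like cross-validated classifier accuracy, which the abstract does not detail.

```python
def forward_selection(num_features, score, max_features):
    """Greedy forward selection.

    `score` maps a tuple of feature indices to a quality measure (higher is
    better). At each step the feature giving the best score is added, and
    selection stops as soon as no candidate improves on the current score.
    """
    selected = []
    best_score = score(tuple(selected))
    while len(selected) < max_features:
        candidates = [f for f in range(num_features) if f not in selected]
        if not candidates:
            break
        trial_scores = {f: score(tuple(selected + [f])) for f in candidates}
        best_feature = max(trial_scores, key=trial_scores.get)
        if trial_scores[best_feature] <= best_score:
            break
        selected.append(best_feature)
        best_score = trial_scores[best_feature]
    return selected, best_score


def toy_score(features):
    """Hypothetical scorer: features 2 and 0 help, feature 1 is noise."""
    gains = {0: 0.25, 1: -0.125, 2: 0.5}
    return 0.25 + sum(gains[f] for f in features)
```

Backward selection runs the same loop in reverse, starting from all features and removing one at a time. Exhaustive selection scores every subset, which is only feasible for small feature counts.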
Hybrid Combiner Circuit Of Multi Network Operator For Capacity Enhancement Solution In Indoor Environment
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576767
S. S. Sarnin, M. Yusuf, Ros Shilawani S. Abdul Kadir, N. F. Naim, W. N. W. Mohamad, Mohd Nor Md Tan
This research focuses on providing a solution for a mobile service provider with Multi Network Operators (MNOs), using a single multi-beam antenna via a hybrid circuit to deliver excellent service to the thousands of Mobile Subscribers (MS) attending events at Nasional Bukit Jalil Kuala Lumpur Stadium. A combination circuit design based on the Hybrid Combiner (HC) is used to combine multiple MNOs, minimizing space and cost while preserving the aesthetic value of the national stadium. During a major event, MS users may have trouble accessing the service because network congestion makes it unavailable. In this situation, the MNOs must have additional capacity to meet the demand for data transmission and voice calls. Improving network output and quality of service should in turn be reflected in customer loyalty. With the proposed solution in place, MS users will be able to access the network and enjoy live feeds via Facebook (FB) and other software applications without delay or interruption, and without voice call congestion. The results of the proposed solution are compared with walk test results and with a coverage simulation analysis produced using planning methods. Data statistics taken from the MNOs demonstrate the effectiveness of the solution in terms of signal quality level, where the Signal-to-Interference-plus-Noise Ratio (SINR) was recorded at −95 dBm, below the threshold of −85 dBm, to prevent interference with MS users. The Resource Block (RB) utilization shows that all sectors are below 70% of total available capacity, which means that the congestion level is manageable and MS users are able to access the network without interruption. Fast deployment, low maintenance, and a solution shared between MNOs are key factors in the proposed study, known as the Hybrid Combiner Circuit of Multi Network Operator for Capacity Enhancement Solution in Indoor Environment.
Citations: 0
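The capacity criterion stated in the abstract above (Resource Block utilization below 70% of total available capacity in every sector) can be illustrated with a minimal Python sketch. The sector names and RB counts below are hypothetical values chosen for illustration only, not figures from the paper, and the helper functions are not part of any operator tool.

```python
# Threshold taken from the abstract: utilization below 70% is considered manageable.
CONGESTION_THRESHOLD = 0.70


def rb_utilization(used_rbs, total_rbs):
    """Return the fraction of Resource Blocks in use for one sector."""
    if total_rbs <= 0:
        raise ValueError("total_rbs must be positive")
    return used_rbs / total_rbs


def congested_sectors(sectors, threshold=CONGESTION_THRESHOLD):
    """Return the names of sectors whose utilization is at or above the threshold."""
    return [
        name
        for name, (used, total) in sectors.items()
        if rb_utilization(used, total) >= threshold
    ]


# Hypothetical per-sector (used RBs, total RBs) samples; NOT data from the paper.
sample_sectors = {
    "sector_1": (52, 100),
    "sector_2": (69, 100),
    "sector_3": (40, 100),
}

if __name__ == "__main__":
    flagged = congested_sectors(sample_sectors)
    print("Congested sectors:", flagged if flagged else "none")
```

An empty result means every sector stays under the 70% threshold, which is the condition the abstract describes as a manageable congestion level.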
Development Of A Deep Learning Model To Classify X-Ray Of Covid-19, Normal And Pneumonia-Affected Patients
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576804
Boon Kai Law, Lih Poh Lin
Pneumonia is commonly seen in several diseases, including COVID-19, which has put countries under lockdown today [1]. Other than the antigen rapid test kit (RTK) and reverse transcription-polymerase chain reaction (RT-PCR), an alternative method to detect COVID-19 is the examination of patients' chest radiography (CXR). However, the results of manual inspection may be wrong, and misdiagnosis could lead to fatal consequences such as delayed treatment and death. Manual inspection can be inconsistent and inaccurate, and may differ between individuals because of their different perspectives. Often, COVID-19 X-rays are misinterpreted as bacterial pneumonia. With the advancement of technology, this issue can be overcome by developing a Convolutional Neural Network (CNN) model that uses deep learning to categorize X-rays of normal, pneumonia-affected and COVID-19 patients. In this work, various CNN models (ResNet-50, ResNet-101, VGG-16, VGG-19 and SqueezeNet) were trained with public databases that contain a combination of 1345 viral pneumonia, 1200 COVID-19 and 1341 normal CXR images. Transfer learning, aided by image augmentation, was employed for training and validation of the ResNet-50, ResNet-101, VGG-16 and VGG-19 architectures. Meanwhile, SqueezeNet was trained from scratch to investigate the importance of transfer learning to the model. The highest training accuracy achieved in this study was 97.38%, by the VGG-16 model using a learning rate of 0.01, whereas the highest weighted average accuracy achieved was 94%, by the VGG-16 model using a learning rate of 0.01 and the VGG-19 model using a learning rate of 0.001. The reliability and high accuracy of the CNN model would open a new avenue for the diagnosis of COVID-19.
Citations: 5
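The abstract above reports a "weighted average accuracy" without defining it. One common reading, assumed here, is a per-class score averaged with weights proportional to the number of images in each class. The sketch below uses the class counts given in the abstract; the per-class scores are hypothetical placeholders, not results from the paper, and the paper's actual evaluation split may differ from the full dataset counts.

```python
# Class counts as stated in the abstract (full dataset, not necessarily the test split).
CLASS_COUNTS = {
    "viral_pneumonia": 1345,
    "covid19": 1200,
    "normal": 1341,
}


def weighted_average(per_class_score, class_counts):
    """Average per-class scores, weighting each class by its image count."""
    total = sum(class_counts.values())
    if total == 0:
        raise ValueError("class_counts must contain at least one image")
    weighted_sum = sum(
        per_class_score[label] * count for label, count in class_counts.items()
    )
    return weighted_sum / total


# Hypothetical per-class scores for illustration only; NOT figures from the paper.
hypothetical_scores = {
    "viral_pneumonia": 0.93,
    "covid19": 0.96,
    "normal": 0.94,
}

if __name__ == "__main__":
    score = weighted_average(hypothetical_scores, CLASS_COUNTS)
    print(f"Support-weighted average score: {score:.4f}")
```

Because the three classes are close in size, the weighted average stays near the plain mean of the per-class scores; the weighting matters more when classes are strongly imbalanced.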
Journal
2021 IEEE International Conference on Signal and Image Processing Applications (ICSIPA)