2023 Intelligent Methods, Systems, and Applications (IMSA): Latest Papers

A Comparative Study of Predictive Data Mining Techniques for Customer Churn in the Banking Industry
Pub Date : 2023-07-15 DOI: 10.1109/IMSA58542.2023.10217514
D. O. Orina, R. Rimiru, W. Mwangi
Increasing competition in the banking industry has made customer churn analysis and prediction a crucial concern. Banks must now adopt customer retention strategies while also working to acquire new customers and expand their market share. Because acquiring new customers is more expensive than retaining existing ones, machine learning techniques for predicting churn are valuable to banks. This research compares several supervised machine learning algorithms and, based on the experimental results, suggests the best-suited model for predicting customer churn. The process involves cross-validation, balancing the data with the SMOTE algorithm, and modeling with both simple machine learning algorithms and ensemble methods. The experiments revealed that the random forest model performed best, achieving an accuracy of 88%, an area under the curve (AUC) of 0.85, and an F1-score of 0.85 on balanced data. This result is consistent with the related research considered in the paper, which has shown random forest to be one of the most effective algorithms for customer classification problems. Feature importance analysis from the optimization models indicated that the difference between deposits and withdrawals was the most significant attribute, while the maximum deposit per product was the least significant. The data mining techniques compared are Decision Tree, Neural Networks, Support Vector Machine, Logistic Regression, Random Forest, XG-Boost, Ada-Boost, and K-Nearest Neighbor.
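The balancing step mentioned in the abstract can be illustrated with a simplified, SMOTE-style oversampler. This is only a sketch of the interpolation idea behind SMOTE, not the authors' pipeline or the imbalanced-learn implementation; the function name, the toy feature vectors, and the choice of k are all assumptions made for illustration.

```python
import random

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy data (hypothetical): 6 retained customers and 3 churners, 2 features each.
majority = [[0.1, 1.0], [0.2, 0.9], [0.3, 1.1], [0.2, 1.2], [0.4, 1.0], [0.1, 0.8]]
minority = [[2.0, 3.0], [2.2, 3.1], [1.9, 2.8]]

n_needed = len(majority) - len(minority)
balanced_minority = minority + smote_like_oversample(minority, n_needed, k=2)
print(len(majority), len(balanced_minority))
```

In a cross-validated setup, oversampling of this kind is normally applied to the training folds only, so that synthetic points never leak into the validation data.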
Citations: 0
A Comparative Study of Weight Initialization Techniques for Convolutional Neural Networks in COVID-19 Classification from X-ray Images.
Pub Date : 2023-07-15 DOI: 10.1109/IMSA58542.2023.10217655
Abdelrahman Ezzeldin Nagib, M. Saeed, Shereen Fathy El-Feky, Ali Khater Mohamed
The rapid spread of the COVID-19 pandemic has created a pressing need for accurate and efficient diagnostic tools. Recently, convolutional neural networks (CNNs) have shown great potential in classifying COVID-19 infected cases from X-ray images, but the choice of weight initialization technique plays a crucial role in their performance. This paper presents a comparative study of weight initialization techniques, namely Glorot, Orthogonal, and Random Uniform, in the context of COVID-19 classification. The results show that the Random Uniform initialization technique outperforms the other weight initialization techniques in terms of overall classification accuracy. Keywords: COVID-19, Convolutional Neural Networks, X-ray images, Weight initialization, Classification
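To make the difference between two of the named initializers concrete, the sketch below samples a weight matrix with Glorot uniform and with a fixed-range random uniform scheme and compares the empirical variance with the theoretical one. It uses only the standard library; the layer sizes and the bounds of ±0.05 are assumptions, and the orthogonal initializer is omitted for brevity. None of this is taken from the paper's code.

```python
import math
import random

def glorot_uniform(fan_in, fan_out, rng):
    """Glorot (Xavier) uniform: U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)] for _ in range(fan_in)]

def random_uniform(fan_in, fan_out, rng, low=-0.05, high=0.05):
    """Plain uniform initialization with a fixed range (the bounds are an assumption)."""
    return [[rng.uniform(low, high) for _ in range(fan_out)] for _ in range(fan_in)]

def variance(matrix):
    """Population variance over all entries of a nested-list matrix."""
    values = [w for row in matrix for w in row]
    mean = sum(values) / len(values)
    return sum((w - mean) ** 2 for w in values) / len(values)

rng = random.Random(42)
w_glorot = glorot_uniform(64, 32, rng)
w_uniform = random_uniform(64, 32, rng)

# Theoretical variances: limit^2 / 3 = 2 / (fan_in + fan_out) for Glorot,
# and (high - low)^2 / 12 for the fixed-range uniform scheme.
print(variance(w_glorot), 2.0 / (64 + 32))
print(variance(w_uniform), 0.1 ** 2 / 12)
```

The Glorot range shrinks as the layer gets wider, while the fixed-range scheme ignores layer size; which one works better for a given network is an empirical question of the kind the study investigates.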
Citations: 0
Distance-Based Meta-Features for Arabic Text Classification
Pub Date : 2023-07-15 DOI: 10.1109/IMSA58542.2023.10217428
Maroua Louail, Chafia Kara-Mohamed alias Hamdi-Cherif
Text classification (TC) is the process by which a computer assigns a label to a given text based on its content. Term Frequency–Inverse Document Frequency (TF-IDF) is one of the popular methods used for Arabic text representation. High dimensionality and sparseness are among the main issues faced by the TF-IDF method, leading to large storage requirements, high computational costs, and a risk of overfitting. In this paper, we focus on four distance-based meta-features, CosKNN, L2KNN, CosCent, and L2Cent, derived from the TF-IDF representations, as a dimensionality reduction method for Arabic text classification. Four well-known classifiers are used in the present work, K-Nearest Neighbors, Logistic Regression, Support Vector Machines, and Random Forest, to evaluate the impact of these distance-based meta-features on classification performance. The obtained results show that the proposed dimensionality reduction method improves the classification accuracy in 50% of the cases and speeds up the training phase (between 8x and 1764x faster) when compared to the original TF-IDF. As far as we know, distance-based meta-features are used for Arabic text classification for the first time.
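The centroid-based meta-features can be sketched as follows. The abstract does not spell out how CosCent and L2Cent are constructed, so this block assumes one plausible reading: each document is replaced by its cosine and Euclidean distances to every class centroid, which shrinks the representation from vocabulary size to twice the number of classes. The class names, vectors, and function names are hypothetical.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def l2_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0

def centroid_meta_features(doc, centroids):
    """Map one TF-IDF vector to [cosine distance per class] + [L2 distance per class]."""
    labels = sorted(centroids)
    cos_part = [cosine_distance(doc, centroids[c]) for c in labels]
    l2_part = [l2_distance(doc, centroids[c]) for c in labels]
    return cos_part + l2_part

# Toy TF-IDF vectors (hypothetical) over a 4-term vocabulary, grouped by class.
train = {
    "sports": [[0.9, 0.1, 0.0, 0.0], [0.8, 0.0, 0.2, 0.0]],
    "economy": [[0.0, 0.1, 0.9, 0.3], [0.1, 0.0, 0.7, 0.5]],
}
centroids = {label: centroid(vecs) for label, vecs in train.items()}

# Feature order follows the sorted labels: economy first, then sports.
features = centroid_meta_features([0.85, 0.05, 0.1, 0.0], centroids)
print(features)
```

The KNN variants (CosKNN, L2KNN) would presumably use distances to the nearest training neighbours of each class instead of the centroids; that part is not shown here.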
Citations: 0
Machine learning model for predicting pancreatic fistula after pancreatoduodenectomy
Pub Date : 2023-07-15 DOI: 10.1109/IMSA58542.2023.10217619
Hager Saleh, Nora El-Rashidy, Eman Mohamed, Ahmad M Sultan, Ayman El Nakeeb, Shaker El-Sappagh
Pancreaticoduodenectomy (PD) is a complex surgery used mainly to treat tumors and other pancreas disorders. PD is considered one of the most challenging surgeries because it may have several complications, including bleeding and infections in the surgical area, temporary or permanent diabetes, and pancreatic leakage (PL), which may lead to morbidity and mortality. In this study, we build an accurate and medically oriented machine learning model that predicts PL after PD based on patient markers collected only before the PD operation. The study uses a real-world dataset of 397 Egyptian patients. The proposed machine learning pipeline starts with a data preprocessing step that imputes missing values with the median. In the next step, diverse interpretable classifiers, including logistic regression, random forest, decision tree, support vector machine, XGBoost, and AdaBoost, are used to predict PL. Hyperparameter optimization is done using grid search with k-fold cross-validation. The results indicate that XGBoost achieves the highest scores, outperforming the state-of-the-art techniques on several evaluation metrics (accuracy = 91%, precision = 90.96%, recall = 91.0%, F1-score = 90.97%, and AUC = 89.78%). The resulting model is accurate enough to be medically relevant for PL prediction in real healthcare settings.
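The two generic building blocks named in the abstract, median imputation and k-fold splitting, can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: the rows are invented toy values rather than patient data, missing entries are encoded as None, and the folds are contiguous without shuffling or stratification.

```python
from statistics import median

def impute_median(rows):
    """Replace None entries in each column with the median of that column's observed values."""
    columns = list(zip(*rows))
    medians = [median([v for v in col if v is not None]) for col in columns]
    return [[medians[j] if v is None else v for j, v in enumerate(row)] for row in rows]

def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous, near-equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size

# Toy rows (hypothetical): three numeric markers per patient, None = missing.
rows = [[63, 1.2, None], [55, None, 140], [None, 0.9, 120], [70, 1.5, 160], [48, 1.1, 130]]
clean = impute_median(rows)
folds = list(k_fold_indices(len(clean), k=3))

print(clean)
print([val for _, val in folds])
```

A grid search would then train one model per hyperparameter combination on each training split and keep the combination with the best average validation score. To avoid leakage, the medians are usually computed on the training split of each fold rather than on the full dataset as done here.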
Citations: 0
Session (6)
Pub Date : 2023-07-15 DOI: 10.1109/imsa58542.2023.10217770
Citations: 0
Papers Statistics
Pub Date : 2023-07-15 DOI: 10.1109/imsa58542.2023.10217513
Citations: 0
A Comparative Analysis of Deep Learning Models for Brain Tumor Segmentation
Pub Date : 2023-07-15 DOI: 10.1109/IMSA58542.2023.10217767
Maha AbdElwareth, Mariem Abdou, Michael Adel, Alaa Hatem, Login Darwish, Remon Mamdouh, Sahar Selim
A brain tumor is an extremely hazardous illness that can affect people of any age. Less than 50% of individuals with brain cancer have a chance of surviving. As a result, precise segmentation of brain tumors is crucial for diagnosis, treatment planning, and tracking of tumor growth. Deep Learning (DL) models can increase the precision and speed of brain tumor diagnosis by precisely segmenting and identifying tumor locations in medical images. In this study, we compare four DL models for segmenting brain tumors: the 3D U-Net, the Attention Res U-Net, the U-Net++, and the U-Net Transformer (UNETR). We used 485 MRI (Magnetic Resonance Imaging) scans from the BraTS 2018 dataset, which include annotated ground truth tumor segmentations. We carried out preprocessing operations such as label merging, cropping, and z-score normalization. We evaluated the performance of the models using the Dice coefficient metric. Our findings demonstrated that the Attention Res U-Net has a higher segmentation accuracy than the other three models, with a testing Dice coefficient of 0.79 against 0.78, 0.77, and 0.72 for the 3D U-Net, UNETR, and U-Net++, respectively. The results point to the Attention Res U-Net as a potentially useful method for brain tumor segmentation tasks.
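The evaluation metric and one of the preprocessing steps can be written out directly. The sketch below computes the Dice coefficient on flattened binary masks and applies z-score normalization to a list of intensities; the masks and intensity values are invented toy data, and real pipelines would operate on 3D arrays, typically normalizing over brain voxels only.

```python
import math

def z_score_normalize(voxels):
    """Shift to zero mean and scale to unit standard deviation."""
    mean = sum(voxels) / len(voxels)
    std = math.sqrt(sum((v - mean) ** 2 for v in voxels) / len(voxels))
    return [(v - mean) / std for v in voxels] if std else [0.0 for _ in voxels]

def dice_coefficient(pred, truth):
    """Dice = 2 * |P and T| / (|P| + |T|) for flattened binary masks."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Two empty masks agree perfectly by convention.
    return 2.0 * intersection / total if total else 1.0

# Toy masks (hypothetical): 1 marks a tumor voxel.
truth = [0, 1, 1, 1, 0, 0, 1, 0]
pred = [0, 1, 1, 0, 0, 1, 1, 0]

print(dice_coefficient(pred, truth))  # 3 overlapping voxels, 4 + 4 in total -> 0.75
print(z_score_normalize([100.0, 120.0, 140.0]))
```

A Dice value of 1.0 means the predicted and annotated masks coincide, and 0.0 means they do not overlap at all, which is the scale on which the reported 0.72 to 0.79 scores should be read.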
Citations: 0
Named Entity Recognition for Arabic Medical Texts Using Deep Learning Models
Pub Date : 2023-07-15 DOI: 10.1109/IMSA58542.2023.10217658
Hamada Nayel, Nourhan Marzouk, Ahmed N. Elsawy
Named Entity Recognition (NER) plays a vital role in extracting meaningful information from textual data in the medical domain. This paper focuses on NER for Arabic medical texts, specifically targeting the recognition of disease entities. The study presents a comparative analysis of deep learning techniques, including Conditional Random Fields (CRF), Long Short-Term Memory (LSTM), LSTM-CRF, and Bidirectional LSTM (BiLSTM), applied to a dataset comprising Arabic medical texts related to diseases. The dataset is meticulously annotated, ensuring accurate labelling of disease entities for training and evaluation purposes. The models are trained and evaluated using appropriate loss functions and evaluation metrics, such as precision, recall, and F1-score. Comparative experiments are conducted to assess the performance of each model on the disease dataset. The results demonstrate the effectiveness of deep learning techniques for NER in Arabic medical texts, with the LSTM-CRF and BiLSTM-CRF models outperforming the standalone CRF and LSTM models. The LSTM-CRF and BiLSTM-CRF models reported F1-scores of 0.97 and 0.94, respectively. These hybrid models achieve higher precision, recall, and F1-score, showcasing their ability to accurately identify disease entities in Arabic medical texts. The findings of this study contribute to the advancement of NER techniques for Arabic medical texts, focusing on disease entities. The comparative analysis of CRF, LSTM, LSTM-CRF, and BiLSTM models provides valuable insights into their respective strengths and limitations for NER in Arabic medical texts. These insights can guide the selection and implementation of appropriate models for disease entity recognition in Arabic medical texts, facilitating accurate information extraction and analysis in the medical domain.
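Precision, recall, and F1-score for NER are often computed at the entity level from BIO tag sequences. The sketch below shows that calculation under two assumptions that the abstract does not confirm: that the data uses BIO tagging, and that an entity counts as correct only on an exact span-and-type match. The tag name DIS and the example sequences are hypothetical.

```python
def extract_spans(tags):
    """Return the set of (start, end, type) entity spans encoded by BIO tags."""
    spans, start, kind = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and kind == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, kind))  # end index is exclusive
            start, kind = None, None
        if tag.startswith("B-"):
            start, kind = i, tag[2:]
    return spans

def precision_recall_f1(gold_tags, pred_tags):
    """Entity-level scores using exact span-and-type matching."""
    gold, pred = extract_spans(gold_tags), extract_spans(pred_tags)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy sequences (hypothetical): the prediction adds one spurious disease mention.
gold = ["B-DIS", "I-DIS", "O", "O", "B-DIS", "O"]
pred = ["B-DIS", "I-DIS", "O", "B-DIS", "B-DIS", "O"]

precision, recall, f1 = precision_recall_f1(gold, pred)
print(precision, recall, f1)
```

Here both gold entities are found, so recall is 1.0, while one of the three predicted entities is wrong, so precision is 2/3 and the F1-score is 0.8.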
Citations: 1
Biometric Authentication Using Face Thermal Images Based on Neural Fuzzy Extractor
Pub Date : 2023-07-15 DOI: 10.1109/IMSA58542.2023.10217752
A. Sulavko, I. Panfilova, A. Samotuga, Samal Zhumazanova
A method of biometric authentication based on the thermogram of the subject's face was proposed. This method allows you to associate a biometric image of a person with a cryptographic key or password, as well as protect the biometric image and key (password) from being compromised during storage and transmission over communication channels. This effect was achieved through the use of a fuzzy neural extractor trained according to the GOST R 52633.5 standard. The solution also uses a deep convolutional neural network for face detection and an Inception-Resnet network for feature embedding. RetinaFace, ResNet50 and VGG-Face were tested as alternatives to these neural network models. The best result achieved was EER = 4.91
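The equal error rate (EER) quoted as the headline result is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below approximates it by sweeping a decision threshold over observed scores. It assumes similarity scores where higher means more likely genuine; the score lists are invented, and nothing here reproduces the paper's fuzzy extractor or its evaluation protocol.

```python
def equal_error_rate(genuine, impostor):
    """Approximate EER: sweep thresholds over the observed scores and return the
    mean of FAR and FRR at the threshold where the two rates are closest."""
    best = None
    for threshold in sorted(set(genuine) | set(impostor)):
        # A sample is accepted when its score is at or above the threshold.
        far = sum(s >= threshold for s in impostor) / len(impostor)  # false accepts
        frr = sum(s < threshold for s in genuine) / len(genuine)     # false rejects
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2)
    return best[1]

# Toy similarity scores (hypothetical).
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]
impostor = [0.30, 0.42, 0.55, 0.70, 0.25]

eer = equal_error_rate(genuine, impostor)
print(eer)
```

With these scores, a threshold of 0.70 accepts one of five impostors and rejects one of five genuine samples, so both rates are 0.2 and the EER is 0.2, that is, 20%.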
Citations: 0
End of Session (14)
Pub Date : 2023-07-15 DOI: 10.1109/imsa58542.2023.10217657
Citations: 0