International Journal of Intelligent Computing and Information Sciences: Latest Articles, Page 7

Machine and Deep Learning Approaches for Human Activity Recognition

International Journal of Intelligent Computing and Information Sciences

Pub Date : 2021-09-21 DOI: 10.21608/ijicis.2021.82008.1106

maha alhumayani, Mahmoud Monir, R. Ismail

Human Activity Recognition (HAR) is a domain that has attracted great interest in past years and continues to do so. The main reason is that it can be used in various applications. Several devices and sensors can capture and record activities. This paper surveys machine learning and deep learning methodologies in HAR, with information about the data, filtering methods, feature extraction methods, classification, and different performance measurements. The main aim is to cover both older and recent papers published on HAR and to determine whether machine learning or deep learning methods perform better. In addition, the survey covers the types of actions or activities that are predicted. A discussion of the main points obtained from the survey follows. Finally, the conclusions, limitations, and challenges in HAR are presented. HAR has been defined in various ways. It is considered to be the field of studying and identifying the movements or actions of individuals based on sensor data. These movements can be different activities such as walking, talking, standing, and sitting, which are also called indoor activities.

Citations: 4

Comparison of Satellite Images Classification Techniques using Landsat-8 Data for Land Cover Extraction

International Journal of Intelligent Computing and Information Sciences

Pub Date : 2021-09-16 DOI: 10.21608/ijicis.2021.78853.1098

Soha Ahmed

Accurate extraction of land cover types from thematic maps using satellite images remains a critical challenge. Selecting a suitable satellite image classification algorithm is a crucial prerequisite for the successful classification results that various applications require, and choosing the optimal algorithm is key to improving classification accuracy. This study compares and assesses the performance and effectiveness of four classification techniques (ISODATA, K-means, pixel-based, and segment-based classification) for accurate land cover extraction from remote sensing data. The classified images were validated with ground control points obtained from field visits, in addition to DigitalGlobe and Google Earth Pro. The overall accuracies of the ISODATA, K-means, pixel-based, and segment-based classifications were 81.82%, 77.27%, 92.42%, and 87.88%, respectively. The results revealed that the pixel-based classification was superior in terms of overall accuracy and kappa coefficient.
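The abstract above compares classifiers by overall accuracy and kappa coefficient. As a rough illustration of how these two figures are usually derived from a confusion matrix, here is a minimal sketch in Python. The function name and the 3-class matrix are hypothetical and are not taken from the paper.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of reference samples of class i
    that the classifier labelled as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total

    # Chance agreement: sum over classes of (row total * column total) / total^2.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical validation result for three land cover classes (100 points).
example_matrix = [
    [25, 3, 2],
    [2, 30, 3],
    [1, 4, 30],
]
accuracy, kappa = overall_accuracy_and_kappa(example_matrix)
print(f"overall accuracy = {accuracy:.2%}, kappa = {kappa:.3f}")
```

Kappa discounts the agreement that would be expected by chance, which is why it is commonly reported alongside overall accuracy when class sizes are unequal.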

Citations: 4

ENHANCED PIXEL BASED URBAN AREA CLASSIFICATION OF SATELLITE IMAGES USING CONVOLUTIONAL NEURAL NETWORK

International Journal of Intelligent Computing and Information Sciences

Pub Date : 2021-09-15 DOI: 10.21608/ijicis.2021.79070.1099

Noureldin Laban, B. Abdellatif, Hala Moushier, Howida A. Shedeed, M. Tolba

Recent years have witnessed great development in the use of deep learning in applied fields in general, including remote sensing. Satellite imagery classification has played a prominent role in various development processes. This paper presents an improvement in automatic urban classification using a one-dimensional convolutional neural network (1DCNN) architecture. The suggested approach has three enhancement processes. First, training boxes are selected for the different classes, yielding many pixels with variable class signatures; this makes the training process depend on a broad range of signatures for each class. Second, a modified 1D convolution is used to re-encode pixel values to increase discriminative power. Third, a new median filter layer is added at the end of the network architecture to remove noise-like pixels and make the resulting map smoother. An image of Greater Cairo is used, and the different urban classes are defined within it. The proposed method was compared with other pixel-based methods and proved to be numerically and visually superior.
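The third step in the abstract above smooths the classified map with a median filter. The sketch below shows the general idea on a small label map, assuming a 3x3 window that is clipped at the image border. It is a stand-alone post-processing function written for illustration, not the network layer described in the paper, and the function name and sample map are hypothetical.

```python
from statistics import median_low


def median_filter_3x3(label_map):
    """Apply a 3x3 median filter to a 2D list of integer class labels.

    At the borders the window is clipped to the pixels that exist.
    median_low is used so the result is always one of the labels in the window.
    """
    rows, cols = len(label_map), len(label_map[0])
    smoothed = [row[:] for row in label_map]
    for r in range(rows):
        for c in range(cols):
            window = [
                label_map[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            smoothed[r][c] = median_low(window)
    return smoothed


# Hypothetical class map: one isolated pixel of class 3 inside a class-1 region.
noisy_map = [
    [1, 1, 1, 1],
    [1, 3, 1, 1],
    [1, 1, 1, 1],
]
for row in median_filter_3x3(noisy_map):
    print(row)
```

An isolated pixel that disagrees with all of its neighbours is replaced by the surrounding class, which is the smoothing effect the abstract refers to.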

Citations: 4

ONTOLOGY-BASED APPROACH FOR FEATURE LEVEL SENTIMENT ANALYSIS

International Journal of Intelligent Computing and Information Sciences

Pub Date : 2021-09-04 DOI: 10.21608/ijicis.2021.77345.1094

Eman M. Aboelela, Walaa K. Gad, R. Ismail

With state-of-the-art digitalization, user-generated content on the web has grown massively, providing feedback from people on a variety of topics. However, manually managing large-scale user feedback is difficult and time-consuming. The concept of sentiment analysis has therefore emerged. Sentiment analysis is the computerized study of individuals' feelings and opinions about an entity or product. It can be performed at three levels: document level, sentence or phrase level, and feature level. This paper proposes a novel ontology-based approach for feature-level sentiment analysis. The proposed approach extracts product features using semantic similarity and the WordNet ontology, and uses the SentiWordNet dictionary to classify users' comments as positive or negative. Furthermore, it handles negation words to obtain more precise classification results. The approach is assessed on two benchmark Amazon product datasets in terms of accuracy, recall, precision, and F-measure. It reaches 92.4% accuracy, 97.2% precision, 92.8% recall, and 94.4% F-measure.
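The four evaluation measures named in the abstract above have standard definitions for a binary (positive/negative) classifier. The following sketch computes them from confusion counts. The function name and the example counts are hypothetical and do not come from the paper's experiments, and the sketch assumes every denominator is non-zero.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, F-measure) from binary confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. Assumes all denominators are non-zero.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f_measure


# Hypothetical counts for 200 classified review comments.
acc, prec, rec, f1 = classification_metrics(tp=90, fp=10, fn=30, tn=70)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f-measure={f1:.3f}")
```

Because the F-measure is a harmonic mean, it always lies between precision and recall and sits closer to the lower of the two.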

Citations: 2

INTELLIGENT SYSTEM FOR HUMAN AUTHENTICATION USING FUSION OF DORSAL HAND, PALM AND FINGER VEINS

International Journal of Intelligent Computing and Information Sciences

Pub Date : 2021-07-31 DOI: 10.21608/ijicis.2021.73726.1087

Mona A. Ahmed, A. Salem

Multimodal biometric systems are widely used to achieve very high recognition accuracy. This paper reports a novel multimodal biometric system that employs an intelligent technique to authenticate a person by fusing dorsal hand, palm, and finger vein patterns. We improved an image analysis technique to separate the region of interest (ROI) from dorsal hand, palm, and finger vein images. After separating the ROI, we construct a sequence of preprocessing steps that enhance the vein images using a median filter, a Wiener filter, Contrast Limited Adaptive Histogram Equalization (CLAHE), and a homomorphic filter. Our technique is based on two algorithms: principal component analysis (PCA) for feature extraction and a k-Nearest Neighbors (K-NN) classifier for matching. The databases used were the Bosphorus Hand Vein Database, the CASIA Multi-Spectral Palmprint Image Database V1.0 (CASIA database), and the Shandong University Machine Learning and Applications Homologous Multi-modal Traits database (SDUMLA-HMT). Fusing the three biometric traits achieved a Correct Recognition Rate (CRR) of 99.21% with a False Reject Rate (FRR) of 0.04%.
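The matching stage in the abstract above uses a k-Nearest Neighbors classifier on extracted features. A minimal sketch of that matching step follows, assuming the feature vectors have already been reduced (for example by PCA) and that Euclidean distance with majority voting is used; the abstract does not state the distance or the value of k. All names and sample vectors are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training vectors nearest to query.

    Distance is Euclidean; ties in the vote go to the label seen first
    among the nearest neighbours.
    """
    neighbours = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2D feature vectors for two enrolled subjects.
gallery = [(0.0, 0.1), (0.2, 0.0), (1.0, 1.1), (0.9, 1.0)]
identities = ["subject_a", "subject_a", "subject_b", "subject_b"]

print(knn_predict(gallery, identities, query=(0.1, 0.1), k=3))
```

With real vein images the vectors would have many more dimensions, but the matching logic stays the same.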

Citations: 2

Multi-Tenant RDBMS Migration in the Cloud Environment

International Journal of Intelligent Computing and Information Sciences

Pub Date : 2021-07-31 DOI: 10.21608/ijicis.2021.77309.1093

A. Raouf, Alshaimaa Abo-Alian, N. Badr

In a multi-tenant business environment, tenants share the same applications and databases to store their data. Because multi-tenant environments are so widely used, service providers face difficult challenges daily. These challenges center on how to guarantee the quality of service provided to tenants, which is documented in a formal document known as a Service Level Agreement (SLA). In addition, the SLA should consider tenants' irregular workload patterns, which may affect the level of guarantee. This research proposes an Enhanced Multi-Tenant Database Management System (EMT DBMS), together with an Enhanced Multi-tenant Migration Algorithm called EMT-M, which migrates tenants with SLA violations depending on both the number of violations and the variance rate. Experimental results show that the proposed EMT-M algorithm is well suited to migrating such tenants, as it reduces the number of SLA violations compared with previous migration algorithms.

Citations: 0

A REVIEW ON AUTISM SPECTRUM DISORDER DIAGNOSIS USING TASK-BASED FUNCTIONAL MRI

International Journal of Intelligent Computing and Information Sciences

Pub Date : 2021-07-19 DOI: 10.21608/IJICIS.2021.75525.1090

Reem T. Haweel, Noha A. Seada, S. Ghoniemy, A. El-Baz

Autism spectrum disorder (ASD) is a neurodevelopmental disorder associated with impairments in social and language abilities. The current gold standard for diagnosis is the Autism Diagnostic Observation Schedule (ADOS) plus expert clinical judgement. The actual cause of autism is still unknown. Early ASD diagnosis is critical for personalized treatment plans and can lead to significant developmental improvements. Machine learning techniques, especially deep learning, have been widely used in attempts to develop objective computer-aided technologies that diagnose autism from brain imaging modalities. Task-based functional magnetic resonance imaging (TfMRI) is a brain imaging modality that reveals the functional activity of the brain in response to different experiments, in order to study the effects of a brain disease or disorder. This study provides a comprehensive review of research that deploys traditional machine learning and deep learning techniques to diagnose ASD based on TfMRI. Classification results indicate that TfMRI holds early autism biomarkers and suggest that future research should establish multi-modal studies that integrate TfMRI with structural, functional, clinical, and genomic data and involve a larger number of participating subjects.

Citations: 2

Data Augmentation for Arabic Speech Recognition Based on End-to-End Deep Learning

International Journal of Intelligent Computing and Information Sciences

Pub Date : 2021-07-19 DOI: 10.21608/IJICIS.2021.73581.1086

Hamzah A. Alsayadi, A. Abdelhamid, I. Hegazy, Zaki Taha

The end-to-end deep learning approach has greatly enhanced the performance of speech recognition systems. With deep learning techniques, overfitting remains the main problem when little data is available. Data augmentation is a suitable solution to the overfitting problem: it increases the quantity of training data and enhances the robustness of the models. In this paper, we investigate a data augmentation method for enhancing Arabic automatic speech recognition (ASR) based on end-to-end deep learning. Data augmentation is applied to the original corpus to increase the training data through noise adaptation, pitch shifting, and speed transformation. A CNN-LSTM and an attention-based encoder-decoder method are used in building the acoustic model and in the decoding phase. This method is considered state of the art in end-to-end deep learning, and to the best of our knowledge, no prior research has employed data augmentation for CNN-LSTM and attention-based models in Arabic ASR systems. In addition, the language model is built using RNN-LM and LSTM-LM methods. The Standard Arabic Single Speaker Corpus (SASSC) without diacritics is used as the original corpus. Experimental results show that applying data augmentation improved the word error rate (WER) compared with the same approach without data augmentation. The average reduction in WER achieved is 4.55%.
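The abstract above reports its results as word error rate (WER). WER is conventionally the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below implements that standard definition. The function name and the English sample sentences are illustrative only and are unrelated to the SASSC corpus.

```python
def word_error_rate(reference, hypothesis):
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Both arguments are whitespace-separated transcripts. The edit distance
    is computed with the usual dynamic programme, one row at a time.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] = edit distance between the processed reference prefix
    # and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref_words)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Lower values are better, so an improvement in WER means the number goes down.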

Citations: 6

EXTRACTING RELATIONSHIPS BETWEEN BIG FIVE MODEL AND PERSONALITY CHARACTERISTICS IN SOCIAL NETWORKS

International Journal of Intelligent Computing and Information Sciences

Pub Date : 2021-07-19 DOI: 10.21608/IJICIS.2021.77015.1092

Mariam Hassanein, S. Rady, Wedad Hussein, Tarek F. Gharib

Recent research has focused on how the Big Five personality traits are manifested on social networks. These studies demonstrated relationships between the Big Five personality traits and various social network features extracted from user-generated content. This paper investigates the relationships between the Big Five personality traits (Openness to experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism) and attributes of personality characteristics identified as Personal Values and Human Needs. These attributes, or features, are extracted from users' posts on social media. The relationship between the traits and the proposed attributes is investigated through Pearson correlation coefficients. A dataset of 564 Twitter users is used in an experimental study, whose findings showed relevant correlations between the traits and the proposed personality characteristic features. The Conscientiousness, Agreeableness, and Neuroticism traits showed strong relationships with all of the Personal Values features, while the Openness to experience and Neuroticism traits showed strong correlations with the Liberty and Self-expression Needs features, respectively. The study verified the effectiveness of the proposed Personal Values and Human Needs features as indicators of the Big Five personality traits, demonstrating their usefulness for classifying personality characteristics.
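The study above relates trait scores to extracted features through Pearson correlation coefficients. As a reminder of what that coefficient measures, here is a small self-contained sketch. The function name and the per-user scores are hypothetical and are not data from the paper.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations, and the two root sums of squared deviations.
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


# Hypothetical per-user scores: a trait score and a feature score for five users.
trait_scores = [0.2, 0.4, 0.5, 0.7, 0.9]
feature_scores = [0.1, 0.35, 0.4, 0.8, 0.85]
print(round(pearson_r(trait_scores, feature_scores), 3))
```

Values near +1 or -1 indicate a strong linear relationship and values near 0 indicate little linear relationship. Correlation alone does not show that a feature causes or fully predicts a trait.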

Citations: 2

Bidirectional Temporal Context Fusion with Bi-Modal Semantic Features using a gating mechanism for Dense Video Captioning

International Journal of Intelligent Computing and Information Sciences

Pub Date : 2021-07-18 DOI: 10.21608/IJICIS.2021.60216.1055

Noorhan Khaled, M. Aref, M. Marey

Dense video captioning involves detecting interesting events in an untrimmed video and generating a textual description for each event. Many machine intelligence applications, such as video summarization, search and retrieval, and automatic video subtitling to support blind people, benefit from an automated dense caption generator. Most recent works use an encoder-decoder neural network framework that employs a 3D-CNN as an encoder to represent the frames of a detected event and an RNN as a decoder for caption generation. They follow an attention-based mechanism to learn where to focus in the encoded video frames during caption generation. Although attention-based approaches have achieved excellent results, they link visual features directly to textual captions and ignore rich intermediate and high-level video concepts such as people, objects, scenes, and actions. In this paper, we first propose an attention-based fusion that yields a better event representation, one that discriminates between events ending at nearly the same time. In this fusion, hidden states from a bidirectional LSTM sequence video encoder, which encodes the past and future context surrounding a detected event, are fused with the event's visual (R3D) features. Second, we propose to explicitly extract bi-modal semantic concepts (nouns and verbs) from a detected event segment and to dynamically balance the contributions of the proposed event representation and the semantic concepts using a gating mechanism during captioning. Experimental results demonstrate that the proposed attention-based fusion represents an event better for captioning, and that incorporating semantic concepts improves captioning performance.
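
The idea of dynamically balancing two feature sources with a gate can be sketched as follows. This is a generic element-wise sigmoid gate, not the authors' architecture: the function `gated_fusion`, the per-dimension weights, and the toy vectors are all assumptions introduced for illustration, and a trained model would typically learn full weight matrices rather than the per-dimension scalars used here.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gated_fusion(event_vec, semantic_vec, w_event, w_semantic, bias):
    """Blend two equal-length vectors with an element-wise sigmoid gate.

    Simplifying assumption: each dimension has its own scalar weights,
    so gate[i] depends only on event_vec[i] and semantic_vec[i].
    """
    n = len(event_vec)
    if not all(len(v) == n for v in (semantic_vec, w_event, w_semantic, bias)):
        raise ValueError("all inputs must have the same length")
    gates, fused = [], []
    for e, s, we, ws, b in zip(event_vec, semantic_vec, w_event, w_semantic, bias):
        g = sigmoid(we * e + ws * s + b)     # gate value in (0, 1)
        gates.append(g)
        fused.append(g * e + (1.0 - g) * s)  # convex combination of the two sources
    return fused, gates


# Toy inputs (hypothetical values): zero weights and bias give a gate of 0.5,
# so the fused vector is the plain average of the two inputs.
fused, gates = gated_fusion([1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
print(gates, fused)
```

A gate value close to 1 lets the event representation dominate a dimension, while a value close to 0 lets the semantic concept features dominate.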



Citations: 0