Responses to Sad Emotion in Autistic and Normal Developing Children: Is There a Difference?
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.40-46
Mohamed Basel Almourad, Emad Bataineh, Zelal Wattar
This paper describes how gaze patterns differ between Normal Developing (ND) and Autistic (AP) children in response to sad emotion. We employed eye-tracking technology to collect and track the participants' eye movements while showing a dynamic stimulus (video) of a gradual transition from a neutral to a melancholy facial expression in both female and male faces. Our data analysis focused on where in the stimulus the child's gaze fell, and on this basis we found a distinction between the two groups: ND children predominantly concentrated on the eye and mouth regions of both male and female sad faces, whereas AP children showed little interest in these areas and tended to glance away from the stimulus faces. Based on the findings, an eye-tracking model for early ASD diagnosis can be constructed, which would aid the early treatment of autistic children as well as the development of their socio-cognitive skills.
{"title":"Responses to Sad Emotion in Autistic and Normal Developing Children: Is There a Difference?","authors":"Mohamed Basel Almourad, Emad Bataineh, Zelal Wattar","doi":"10.18178/joig.11.1.40-46","DOIUrl":"https://doi.org/10.18178/joig.11.1.40-46","url":null,"abstract":"This paper describes how the gazing pattern differ between the responses of Normal Developing (ND) and Autistic (AP) children to sad emotion. We employed an eye tracking technology to collect and track the participants’ eye movements by showing a dynamic stimulus (video) that showed a gradual transition from pale emotions to melancholy facial expressions in both female and male faces. The location of the child's gaze in the stimulus was the focus of our data analysis. We deduced that there was a distinction between the two groups based on this. ND children predominantly concentrated on the eyes and mouth region of both male and female sad faces, but AP children showed no interest in these areas by glancing away from the stimuli faces. Based on the findings, an ideal eye tracking model for early ASD diagnosis can be constructed. This will aid in the early treatment of Autism children as well as the development of socio-cognitive skills.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"42 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88245145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluation of Transfer Learning for Handwritten Character Classification Using Small Training Samples
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.21-25
Y. Mitani, Naoki Yamaguchi, Y. Fujita, Y. Hamamoto
In pattern recognition, it is worthwhile to develop a system that "hears one and knows ten", that is, one that generalizes from very few examples. Classification of printed characters in a single font is now largely solved, but classification of handwritten characters remains difficult. At the same time, there are a large number of writing systems in the world, so efficient character classification is needed even when only a small sample is available. Deep learning is one of the most effective approaches for image recognition. Despite this, deep learning overfits easily, particularly when the number of training samples is small, and therefore usually requires a large number of training samples. In practical pattern recognition problems, however, the number of training samples is limited. One way to overcome this situation is transfer learning, in which a model pretrained on many samples is reused. In this study, we evaluate the generalization performance of transfer learning for handwritten character classification with a small training sample size, exploring fine-tuning to fit the small training sample. The experimental results show that transfer learning was more effective for handwritten character classification than convolutional neural networks trained from scratch. Transfer learning is therefore expected to be one way to design a pattern recognition system that works effectively even with a small sample.
{"title":"Evaluation of Transfer Learning for Handwritten Character Classification Using Small Training Samples","authors":"Y. Mitani, Naoki Yamaguchi, Y. Fujita, Y. Hamamoto","doi":"10.18178/joig.11.1.21-25","DOIUrl":"https://doi.org/10.18178/joig.11.1.21-25","url":null,"abstract":"In pattern recognition fields, it is worthwhile to develop a pattern recognition system that hears one and knows ten. Recently, classification of printed characters that are the same fonts is almost possible, but classification of handwritten characters is still difficult. On the other hand, there are a large number of writing systems in the world, and there is a need for efficient character classification even with a small sample. Deep learning is one of the most effective approaches for image recognition. Despite this, deep learning causes overtrains easily, particularly when the number of training samples is small. For this reason, deep learning requires a large number of training samples. However, in a practical pattern recognition problem, the number of training samples is usually limited. One method for overcoming this situation is the use of transfer learning, which is pretrained by many samples. In this study, we evaluate the generalization performance of transfer learning for handwritten character classification using a small training sample size. We explore transfer learning using a fine-tuning to fit a small training sample. The experimental results show that transfer learning was more effective for handwritten character classification than convolution neural networks. Transfer learning is expected to be one method that can be used to design a pattern recognition system that works effectively even with a small sample.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"82 2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77906336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Efficient Hybrid Algorithm for Human Action Recognition
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.72-81
Mostafa A. Abdelrazik, A. Zekry, W. A. Mohamed
Recently, researchers have sought the ideal way to recognize human actions in video using artificial intelligence, driven by the many applications that rely on it across numerous fields. Broadly, the methods divide into traditional methods and deep learning methods, the latter of which have provided a qualitative leap in computer vision. Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) are the most popular algorithms used with images and video, and many studies have combined the two in search of the best results. In an attempt to obtain improved results in motion recognition from video, we present in this paper a combined algorithm divided into two main parts, a CNN part and an RNN part. The first part contains a preprocessing stage that prepares each video frame as input to two CNNs, a fusion of Inception-ResNet-V2 and GoogleNet with previously trained weights, used to obtain activations. These activations are then passed to deep Gated Recurrent Units (GRU) connected to a fully connected SoftMax layer that recognizes and distinguishes the human action in the video. The results show that the proposed algorithm gives better accuracy than the related literature, 97.97% on the UCF101 dataset and 73.12% on the HMDB51 dataset.
{"title":"Efficient Hybrid Algorithm for Human Action Recognition","authors":"Mostafa A. Abdelrazik, A. Zekry, W. A. Mohamed","doi":"10.18178/joig.11.1.72-81","DOIUrl":"https://doi.org/10.18178/joig.11.1.72-81","url":null,"abstract":"Recently, researchers have sought to find the ideal way to recognize human actions through video using artificial intelligence due to the multiplicity of applications that rely on it in many fields. In general, the methods have been divided into traditional methods and deep learning methods, which have provided a qualitative leap in the field of computer vision. Convolutional neural network CNN and recurrent neural network RNN are the most popular algorithms used with images and video. The researchers combined the two algorithms to search for the best results in a lot of research. In an attempt to obtain improved results in motion recognition through video, we present in this paper a combined algorithm, which is divided into two main parts, CNN and RNN. In the first part there is a preprocessing stage to make the video frame suitable for the input of both CNN networks which consist of a fusion of Inception-ResNet-V2 and GoogleNet to obtain activations, with the previously trained wights in Inception-ResNet-V2 and GoogleNet and then passed to a deep Gated Recurrent Units (GRU) connected to a fully connected SoftMax layer to recognize and distinguish the human action in the video. The results show that the proposed algorithm gives better accuracy of 97.97% with the UCF101 dataset and 73.12% in the hdmb51 data set compared to those present in the related literature.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"65 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74601985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Development of a Previsualization Proxy Plug-in Tool
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.26-31
Balgum Song
Previsualization, also known as previs in the digital content industry, is becoming increasingly important. Previsualization in animation, movies, and visual effects (VFX) can enhance ideas and creative story production, so unnecessary expenses can be minimized while output quality is improved. It is crucial to produce proxy models that can be animated quickly during the previsualization production stage. The process is often skipped because it requires additional procedures, and it takes an unskilled person a relatively long time to produce a proxy model. It is therefore important to develop a proxy plug-in tool that simplifies the motion process. A new method was developed that creates a bounding box attached to each joint, differentiating it from the existing high-poly to low-poly working process. This proxy plug-in method allows the motion process to proceed in fewer steps, with better operation, steady speed performance, and a precise enough shape to work efficiently in previsualization. Using the proxy plug-in tool to perform the motion may be a solution for creating easy access to previsualization and story production.
{"title":"Development of a Previsualization Proxy Plug-in Tool","authors":"Balgum Song","doi":"10.18178/joig.11.1.26-31","DOIUrl":"https://doi.org/10.18178/joig.11.1.26-31","url":null,"abstract":"Previsualization, also known as previs in the digital content industry, is becoming increasingly important. Previsualization in animation, movies and visual effects (VFX) can enhance ideas and creative story production. Thus, unnecessary expenses can be minimized while output quality can be improved. It is crucial to produce proxy modeling that can implement animation quickly during the previsualization production stage. The process is often ignored because additional procedures are needed, and it takes a relatively long time for an unskilled person to produce proxy modeling. Therefore, it is imperative to develop a proxy plug-in tool to simplify the motion process using an easy method. A new method was developed for creating a bounding box by attaching it to each joint, differentiating it from the existing high-poly to low-poly working process. This unique proxy plug-in development method allows us to proceed with the motion process in fewer steps, with better operation, steady speed performance, and a precise shape to work efficiently in previsualization. Using the proxy plug-in tool to perform the motion may be a solution for creating easy access to previsualization and story production.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"197 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135076283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optical Flow-Based Algorithm Analysis to Detect Human Emotion from Eye Movement-Image Data
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.53-60
T. T. Zizi, S. Ramli, Muslihah Wook, M. Shukran
One popular approach to recognizing human emotions such as happiness, sadness, and shock is based on the movement of facial features. Motion vectors describing these movements can be calculated with optical flow algorithms: to detect an emotion, the resulting set of motion vectors is compared with a standard template of facial movement caused by human emotional changes. In this paper, a new method is introduced that computes a degree of likeness towards a particular emotion and makes decisions based on the importance of the vectors obtained from an optical flow approach. The current study uses a feature point tracking technique applied separately to five facial image regions (eyebrows, eyes, and mouth) to identify basic emotions, focusing primarily on the eye movement regions. The optical flow method used to find the vectors is selected through a pre-experiment, as explained further below.
{"title":"Optical Flow-Based Algorithm Analysis to Detect Human Emotion from Eye Movement-Image Data","authors":"T. T. Zizi, S. Ramli, Muslihah Wook, M. Shukran","doi":"10.18178/joig.11.1.53-60","DOIUrl":"https://doi.org/10.18178/joig.11.1.53-60","url":null,"abstract":"One of the popular methods for the recognition of human emotions such as happiness, sadness and shock is based on the movement of facial features. Motion vectors that show these movements can be calculated by using optical flow algorithms. In this method, for detecting emotions, the resulted set of motion vectors is compared with a standard facial movement template caused by human emotional changes. In this paper, a new method is introduced to compute the quantity of likeness towards a particular emotion to make decisions based on the importance of obtained vectors from an optical flow approach. The current study uses a feature point tracking technique separately applied to the five facial image regions (eyebrows, eyes, and mouth) to identify basic emotions. Primarily, this research will be focusing on eye movement regions. For finding the vectors, one of the efficient optical flow methods is using the pre-experiment as explained further below.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"15 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85310481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Plant Species Classification Using Leaf Edge Feature Combination with Morphological Transformations and SIFT Key Point
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.91-97
Jiraporn Thomkaew, Sarun Intakosum
This paper presents a new approach to plant classification that combines leaf edge features with Morphological Transformations and defines key points on the leaf edge with SIFT. The process has three steps: image preprocessing, feature extraction, and image classification. In the image preprocessing step, image noise is removed with Morphological Transformations and leaf edges are detected with Canny Edge Detection. Key points on the leaf edge are identified with SIFT, and leaf features are extracted by a CNN according to the proposed method. The plant leaves are then classified by a random forest. Experiments were performed on the PlantVillage dataset with 10 classes: 5 classes of healthy leaves and 5 classes of diseased leaves. The results show that the proposed method classifies plant species more accurately than features based on leaf shape and texture, reaching an accuracy of 95.62%.
{"title":"Plant Species Classification Using Leaf Edge Feature Combination with Morphological Transformations and SIFT Key Point","authors":"Jiraporn Thomkaew, Sarun Intakosum","doi":"10.18178/joig.11.1.91-97","DOIUrl":"https://doi.org/10.18178/joig.11.1.91-97","url":null,"abstract":"This paper presents a new approach to plant classification by using leaf edge feature combination with Morphological Transformations and defining key points on leaf edge with SIFT. There are three steps in the process. Image preprocessing, feature extraction, and image classification. In the image preprocessing step, image noise is removed with Morphological Transformations and leaf edge detect with Canny Edge Detection. The leaf edge is identified with SIFT, and the plant leaf feature was extracted by CNN according to the proposed method. The plant leaves are then classified by random forest. Experiments were performed on the PlantVillage dataset of 10 classes, 5 classes of healthy leaves, and 5 classes of diseased leaves. The results showed that the proposed method was able to classify plant species more accurately than using features based on leaf shape and texture. The proposed method has an accuracy of 95.62%.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"62 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82575874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep Learning in Grapevine Leaves Varieties Classification Based on Dense Convolutional Network
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.98-103
H. A. Ahmed, Hersh M. Hama, S. I. Jalal, M. Ahmed
Grapevine leaves are used worldwide in a vast range of traditional cuisines. As their price and flavor differ from kind to kind, recognizing the various species of grapevine leaves is becoming an essential task, and differentiating grapevine leaf types by human inspection is difficult and time-consuming. Building a machine learning model to automate grapevine leaf classification is therefore highly beneficial, and this is the primary focus of this work. This paper uses a CNN-based model to classify grape leaves by adapting DenseNet201, and investigates the impact of layer freezing on the performance of DenseNet201 during fine-tuning. The work uses a public dataset consisting of 500 images in 5 classes (100 images per class), with several data augmentation methods applied to expand the training set. The proposed CNN model, named DenseNet-30, outperformed the existing grape leaf classification work from which the dataset was borrowed, achieving 98% overall accuracy.
{"title":"Deep Learning in Grapevine Leaves Varieties Classification Based on Dense Convolutional Network","authors":"H. A. Ahmed, Hersh M. Hama, S. I. Jalal, M. Ahmed","doi":"10.18178/joig.11.1.98-103","DOIUrl":"https://doi.org/10.18178/joig.11.1.98-103","url":null,"abstract":"Grapevine leaves are utilized worldwide in a vast range of traditional cuisines. As their price and flavor differ from kind to kind, recognizing various species of grapevine leaves is becoming an essential task. In addition, the differentiation between grapevine leaf types by human sense is difficult and time-consuming. Thus, building a machine learning model to automate the grapevine leaf classification is highly beneficial. Therefore, this is the primary focus of this work. This paper uses a CNN-based model to classify grape leaves by adapting DenseNet201. This study investigates the impact of layer freezing on the performance of DenseNet201 throughout the fine-tuning process. This work used a public dataset consist of 500 images with 5 different classes (100 images per class). Several data augmentation methods used to expand the training set. The proposed CNN model, named DenseNet-30, outperformed the existing grape leaf classification work that the dataset borrowed from by achieving 98% overall accuracy.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82687624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pineapple Sweetness Classification Using Deep Learning Based on Pineapple Images
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.47-52
Sarunya Kanjanawattana, Worawit Teerawatthanaprapha, Panchalee Praneetpholkrang, G. Bhakdisongkhram, Suchada Weeragulpiriya
In Thailand, the pineapple is a valuable crop whose price is determined by its sweetness. A fruit's sweetness can be determined with an optical refractometer or another technique that requires expert judgment, but determining the sweetness of each fruit in this way takes time and effort. This study employed the AlexNet deep learning model to categorize pineapple sweetness levels based on the physical attributes visible in images. The dataset was divided into four classes, M1 to M4, ordered by increasing sweetness, and split into a training set (80%) and a testing set (20%). The experiments were repeated five times, each with a different number of epochs, on the prepared data. The AlexNet model produced the best results when trained on balanced data for 10 epochs with 120 images per class, reaching an accuracy of 91.78% and an F1 score of 92.31%.
{"title":"Pineapple Sweetness Classification Using Deep Learning Based on Pineapple Images","authors":"Sarunya Kanjanawattana, Worawit Teerawatthanaprapha, Panchalee Praneetpholkrang, G. Bhakdisongkhram, Suchada Weeragulpiriya","doi":"10.18178/joig.11.1.47-52","DOIUrl":"https://doi.org/10.18178/joig.11.1.47-52","url":null,"abstract":"In Thailand, the pineapple is a valuable crop whose price is determined by its sweetness. An optical refractometer or another technique that requires expert judgment can be used to determine a fruit's sweetness. Furthermore, determining the sweetness of each fruit takes time and effort. This study employed the Alexnet deep learning model to categorize pineapple sweetness levels based on physical attributes shown in images. The dataset was classified into four classes, i.e., M1 to M4, and sorted in ascending order by sweetness level. The dataset was divided into two parts: training and testing datasets. Training accounted for 80% of the dataset while testing accounted for 20%. This study's experiments were repeated five times, each with a distinct epoch and working with data that had been prepared. According to the experiment, the Alexnet model produced the greatest results when trained with balancing data across 10 epochs and 120 figures per class. The model's accuracy and F1 score were 91.78% and 92.31%, respectively.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"73 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85259602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Application of Medical Image 3D Visualization Web Platform in Auxiliary Diagnosis and Preoperative Planning
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.32-39
Shengyu Bai, Chenxin Ma, Xinjun Wang, Shaolong Zhou, Hongyu Jiang, Ling Ma, Huiqin Jiang
Three-dimensional visualization of medical image data enables doctors to observe images from more angles and higher dimensions, which is of great significance for assisting diagnosis and preoperative planning. Most 3D visualization systems are desktop applications that depend heavily on the hardware and operating system, making them difficult to use across platforms and to maintain, while web-based systems tend to have limited capabilities. To this end, we developed a web application that provides not only DICOM (Digital Imaging and Communications in Medicine) image browsing and annotation but also three-dimensional post-processing functions: multiplanar reconstruction, volume rendering, lung parenchyma segmentation, and brain MRI (Magnetic Resonance Imaging) analysis. To improve rendering speed, we run the Marching Cubes algorithm for 3D reconstruction asynchronously in the background and save the reconstructed model as glTF (GL Transmission Format). The Draco compression algorithm is then used to optimize the glTF model for more efficient rendering. In the performance evaluation, the system reconstructed a CT (Computed Tomography) series of 242 slices; the optimized model was only 6.37 MB, with a rendering time of less than 2.5 s. Three-dimensional visualization of the lung parenchyma clearly shows the volume, location, and shape of pulmonary nodules, and the segmentation and reconstruction of different brain tissues can reveal the spatial three-dimensional structure and adjacency relationships of gliomas in the brain, which has great application value in auxiliary diagnosis and preoperative planning.
{"title":"Application of Medical Image 3D Visualization Web Platform in Auxiliary Diagnosis and Preoperative Planning","authors":"Shengyu Bai, Chenxin Ma, Xinjun Wang, Shaolong Zhou, Hongyu Jiang, Ling Ma, Huiqin Jiang","doi":"10.18178/joig.11.1.32-39","DOIUrl":"https://doi.org/10.18178/joig.11.1.32-39","url":null,"abstract":"Three-dimensional visualization of medical image data can enable doctors to observe images from more angles and higher dimensions. It is of great significance for doctors to assist in diagnosis and preoperative planning. Most 3D visualization systems are based on desktop applications, which are too dependent on hardware and operating system. This makes it difficult to use across platforms and maintain. Web-based systems tend to have limited capabilities. To this end, we developed a web application, which not only provides DICOM (Digital Imaging and Communications in Medicine) image browsing and annotation functions, but also provides three-dimensional post-processing functions of multiplanar reconstruction, volume rendering, lung parenchyma segmentation and brain MRI (Magnetic Resonance Imaging) analysis. In order to improve the rendering speed, we use the Marching Cube algorithm for 3D reconstruction in the background in an asynchronous way, and save the reconstructed model as glTF (GL Transmission Format). At the same time, Draco compression algorithm is used to optimize the glTF model to achieve more efficient rendering. After performance evaluation, the system reconstructed a CT (Computed Tomography) series of 242 slices and the optimized model was only 6.37mb with a rendering time of less than 2.5s. Three-dimensional visualization of the lung parenchyma clearly shows the volume, location, and shape of pulmonary nodules. The segmentation and reconstruction of different brain tissues can reveal the spatial three-dimensional structure and adjacent relationship of glioma in the brain, which has great application value in auxiliary diagnosis and preoperative planning.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86869350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ImECGnet: Cardiovascular Disease Classification from Image-Based ECG Data Using a Multibranch Convolutional Neural Network
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.9-14
Amir Ghahremani, C. Lofi
Reliable Cardiovascular Disease (CVD) classification performed by a smart system can help medical doctors recognize heart illnesses in patients more efficiently and effectively. Electrocardiogram (ECG) signals are an important diagnostic tool, as they are already available early in the patient's health diagnosis process and contain valuable indicators for various CVDs. Most ECG processing methods represent ECG data as a time series, often as a matrix in which each row contains the measurements of one sensor lead, and/or as transforms of such time series, such as wavelet power spectra. While methods processing such time-series data perform well in benchmarks, they remain highly dependent on factors like input noise and sequence length, and cannot always correlate lead data from different sensors well. In this paper, we propose to represent ECG signals with all lead data plotted as a single image, an approach not yet explored in the literature. We show that such an image representation, combined with our newly proposed convolutional neural network designed specifically for CVD classification, can overcome the aforementioned shortcomings. The proposed Convolutional Neural Network (CNN) is designed to extract features representing both the proportional relationships of the leads to each other and the characteristics of each lead separately. Empirical validation on the publicly available PTB, MIT-BIH, and St.-Petersburg benchmark databases shows that the proposed method outperforms time-series-based state-of-the-art approaches, yielding classification accuracies of 97.91%, 99.62%, and 98.70%, respectively.
{"title":"ImECGnet: Cardiovascular Disease Classification from Image-Based ECG Data Using a Multibranch Convolutional Neural Network","authors":"Amir Ghahremani, C. Lofi","doi":"10.18178/joig.11.1.9-14","DOIUrl":"https://doi.org/10.18178/joig.11.1.9-14","url":null,"abstract":"Reliable Cardiovascular Disease (CVD) classification performed by a smart system can assist medical doctors in recognizing heart illnesses in patients more efficiently and effectively. Electrocardiogram (ECG) signals are an important diagnostic tool as they are already available early in the patients’ health diagnosis process and contain valuable indicators for various CVDs. Most ECG processing methods represent ECG data as a time series, often as a matrix with each row containing the measurements of a sensor lead; and/or the transforms of such time series like wavelet power spectrums. While methods processing such time-series data have been shown to work well in benchmarks, they are still highly dependent on factors like input noise and sequence length, and cannot always correlate lead data from different sensors well. In this paper, we propose to represent ECG signals incorporating all lead data plotted as a single image, an approach not yet explored by literature. We will show that such an image representation combined with our newly proposed convolutional neural network specifically designed for CVD classification can overcome the aforementioned shortcomings. The proposed (Convolutional Neural Network) CNN is designed to extract features representing both the proportional relationships of different leads to each other and the characteristics of each lead separately. Empirical validation on the publicly available PTB, MIT-BIH, and St.-Petersburg benchmark databases shows that the proposed method outperforms time seriesbased state-of-the-art approaches, yielding classification accuracy of 97.91%, 99.62%, and 98.70%, respectively.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"58 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87574480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}