Responses to Sad Emotion in Autistic and Normal Developing Children: Is There a Difference?
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.40-46
Mohamed Basel Almourad, Emad Bataineh, Zelal Wattar
This paper describes how gaze patterns differ between Normal Developing (ND) and Autistic (AP) children in response to sad emotion. We employed eye-tracking technology to collect and track the participants' eye movements while showing a dynamic stimulus (video) of a gradual transition from a neutral to a melancholy facial expression in both female and male faces. Our data analysis focused on where in the stimulus the child's gaze fell, and on this basis we found a distinction between the two groups: ND children predominantly concentrated on the eye and mouth regions of both male and female sad faces, whereas AP children showed little interest in these areas and tended to glance away from the stimulus faces. Based on the findings, an eye-tracking model for early ASD diagnosis can be constructed, which would aid the early treatment of autistic children as well as the development of their socio-cognitive skills.
{"title":"Responses to Sad Emotion in Autistic and Normal Developing Children: Is There a Difference?","authors":"Mohamed Basel Almourad, Emad Bataineh, Zelal Wattar","doi":"10.18178/joig.11.1.40-46","DOIUrl":"https://doi.org/10.18178/joig.11.1.40-46","url":null,"abstract":"This paper describes how the gazing pattern differ between the responses of Normal Developing (ND) and Autistic (AP) children to sad emotion. We employed an eye tracking technology to collect and track the participants’ eye movements by showing a dynamic stimulus (video) that showed a gradual transition from pale emotions to melancholy facial expressions in both female and male faces. The location of the child's gaze in the stimulus was the focus of our data analysis. We deduced that there was a distinction between the two groups based on this. ND children predominantly concentrated on the eyes and mouth region of both male and female sad faces, but AP children showed no interest in these areas by glancing away from the stimuli faces. Based on the findings, an ideal eye tracking model for early ASD diagnosis can be constructed. This will aid in the early treatment of Autism children as well as the development of socio-cognitive skills.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"42 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88245145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluation of Transfer Learning for Handwritten Character Classification Using Small Training Samples
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.21-25
Y. Mitani, Naoki Yamaguchi, Y. Fujita, Y. Hamamoto
In pattern recognition, it is worthwhile to develop a system that "hears one and knows ten", that is, one that generalizes from very few examples. Classification of printed characters in a single font is now largely solved, but classification of handwritten characters remains difficult. At the same time, there are a large number of writing systems in the world, so efficient character classification is needed even when only a small sample is available. Deep learning is one of the most effective approaches for image recognition. Despite this, deep learning overfits easily, particularly when the number of training samples is small, and therefore usually requires a large number of training samples. In practical pattern recognition problems, however, the number of training samples is limited. One way to overcome this situation is transfer learning, in which a model pretrained on many samples is reused. In this study, we evaluate the generalization performance of transfer learning for handwritten character classification with a small training sample size, exploring fine-tuning to fit the small training sample. The experimental results show that transfer learning was more effective for handwritten character classification than convolutional neural networks trained from scratch. Transfer learning is therefore expected to be one way to design a pattern recognition system that works effectively even with a small sample.
{"title":"Evaluation of Transfer Learning for Handwritten Character Classification Using Small Training Samples","authors":"Y. Mitani, Naoki Yamaguchi, Y. Fujita, Y. Hamamoto","doi":"10.18178/joig.11.1.21-25","DOIUrl":"https://doi.org/10.18178/joig.11.1.21-25","url":null,"abstract":"In pattern recognition fields, it is worthwhile to develop a pattern recognition system that hears one and knows ten. Recently, classification of printed characters that are the same fonts is almost possible, but classification of handwritten characters is still difficult. On the other hand, there are a large number of writing systems in the world, and there is a need for efficient character classification even with a small sample. Deep learning is one of the most effective approaches for image recognition. Despite this, deep learning causes overtrains easily, particularly when the number of training samples is small. For this reason, deep learning requires a large number of training samples. However, in a practical pattern recognition problem, the number of training samples is usually limited. One method for overcoming this situation is the use of transfer learning, which is pretrained by many samples. In this study, we evaluate the generalization performance of transfer learning for handwritten character classification using a small training sample size. We explore transfer learning using a fine-tuning to fit a small training sample. The experimental results show that transfer learning was more effective for handwritten character classification than convolution neural networks. Transfer learning is expected to be one method that can be used to design a pattern recognition system that works effectively even with a small sample.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"82 2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77906336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Efficient Hybrid Algorithm for Human Action Recognition
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.72-81
Mostafa A. Abdelrazik, A. Zekry, W. A. Mohamed
Recently, researchers have sought the ideal way to recognize human actions in video using artificial intelligence, driven by the many applications that rely on it across numerous fields. Broadly, the methods divide into traditional methods and deep learning methods, the latter of which have provided a qualitative leap in computer vision. Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) are the most popular algorithms used with images and video, and many studies have combined the two in search of the best results. In an attempt to obtain improved results in motion recognition from video, we present in this paper a combined algorithm divided into two main parts, a CNN part and an RNN part. The first part contains a preprocessing stage that prepares each video frame as input to two CNNs, a fusion of Inception-ResNet-V2 and GoogleNet with previously trained weights, used to obtain activations. These activations are then passed to deep Gated Recurrent Units (GRU) connected to a fully connected SoftMax layer that recognizes and distinguishes the human action in the video. The results show that the proposed algorithm gives better accuracy than the related literature, 97.97% on the UCF101 dataset and 73.12% on the HMDB51 dataset.
{"title":"Efficient Hybrid Algorithm for Human Action Recognition","authors":"Mostafa A. Abdelrazik, A. Zekry, W. A. Mohamed","doi":"10.18178/joig.11.1.72-81","DOIUrl":"https://doi.org/10.18178/joig.11.1.72-81","url":null,"abstract":"Recently, researchers have sought to find the ideal way to recognize human actions through video using artificial intelligence due to the multiplicity of applications that rely on it in many fields. In general, the methods have been divided into traditional methods and deep learning methods, which have provided a qualitative leap in the field of computer vision. Convolutional neural network CNN and recurrent neural network RNN are the most popular algorithms used with images and video. The researchers combined the two algorithms to search for the best results in a lot of research. In an attempt to obtain improved results in motion recognition through video, we present in this paper a combined algorithm, which is divided into two main parts, CNN and RNN. In the first part there is a preprocessing stage to make the video frame suitable for the input of both CNN networks which consist of a fusion of Inception-ResNet-V2 and GoogleNet to obtain activations, with the previously trained wights in Inception-ResNet-V2 and GoogleNet and then passed to a deep Gated Recurrent Units (GRU) connected to a fully connected SoftMax layer to recognize and distinguish the human action in the video. The results show that the proposed algorithm gives better accuracy of 97.97% with the UCF101 dataset and 73.12% in the hdmb51 data set compared to those present in the related literature.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"65 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74601985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Development of a Previsualization Proxy Plug-in Tool
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.26-31
Balgum Song
Previsualization, also known as previs in the digital content industry, is becoming increasingly important. Previsualization in animation, movies, and visual effects (VFX) can enhance ideas and creative story production, so unnecessary expenses can be minimized while output quality is improved. It is crucial to produce proxy models that can be animated quickly during the previsualization production stage. The process is often skipped because it requires additional procedures, and it takes an unskilled person a relatively long time to produce a proxy model. It is therefore important to develop a proxy plug-in tool that simplifies the motion process. A new method was developed that creates a bounding box attached to each joint, differentiating it from the existing high-poly to low-poly working process. This proxy plug-in method allows the motion process to proceed in fewer steps, with better operation, steady speed performance, and a precise enough shape to work efficiently in previsualization. Using the proxy plug-in tool to perform the motion may be a solution for creating easy access to previsualization and story production.
{"title":"Development of a Previsualization Proxy Plug-in Tool","authors":"Balgum Song","doi":"10.18178/joig.11.1.26-31","DOIUrl":"https://doi.org/10.18178/joig.11.1.26-31","url":null,"abstract":"Previsualization, also known as previs in the digital content industry, is becoming increasingly important. Previsualization in animation, movies and visual effects (VFX) can enhance ideas and creative story production. Thus, unnecessary expenses can be minimized while output quality can be improved. It is crucial to produce proxy modeling that can implement animation quickly during the previsualization production stage. The process is often ignored because additional procedures are needed, and it takes a relatively long time for an unskilled person to produce proxy modeling. Therefore, it is imperative to develop a proxy plug-in tool to simplify the motion process using an easy method. A new method was developed for creating a bounding box by attaching it to each joint, differentiating it from the existing high-poly to low-poly working process. This unique proxy plug-in development method allows us to proceed with the motion process in fewer steps, with better operation, steady speed performance, and a precise shape to work efficiently in previsualization. Using the proxy plug-in tool to perform the motion may be a solution for creating easy access to previsualization and story production.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"197 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135076283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optical Flow-Based Algorithm Analysis to Detect Human Emotion from Eye Movement-Image Data
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.53-60
T. T. Zizi, S. Ramli, Muslihah Wook, M. Shukran
One popular approach to recognizing human emotions such as happiness, sadness, and shock is based on the movement of facial features. Motion vectors describing these movements can be calculated with optical flow algorithms: to detect an emotion, the resulting set of motion vectors is compared with a standard template of facial movement caused by human emotional changes. In this paper, a new method is introduced that computes a degree of likeness towards a particular emotion and makes decisions based on the importance of the vectors obtained from an optical flow approach. The current study uses a feature point tracking technique applied separately to five facial image regions (eyebrows, eyes, and mouth) to identify basic emotions, focusing primarily on the eye movement regions. The optical flow method used to find the vectors is selected through a pre-experiment, as explained further below.
{"title":"Optical Flow-Based Algorithm Analysis to Detect Human Emotion from Eye Movement-Image Data","authors":"T. T. Zizi, S. Ramli, Muslihah Wook, M. Shukran","doi":"10.18178/joig.11.1.53-60","DOIUrl":"https://doi.org/10.18178/joig.11.1.53-60","url":null,"abstract":"One of the popular methods for the recognition of human emotions such as happiness, sadness and shock is based on the movement of facial features. Motion vectors that show these movements can be calculated by using optical flow algorithms. In this method, for detecting emotions, the resulted set of motion vectors is compared with a standard facial movement template caused by human emotional changes. In this paper, a new method is introduced to compute the quantity of likeness towards a particular emotion to make decisions based on the importance of obtained vectors from an optical flow approach. The current study uses a feature point tracking technique separately applied to the five facial image regions (eyebrows, eyes, and mouth) to identify basic emotions. Primarily, this research will be focusing on eye movement regions. For finding the vectors, one of the efficient optical flow methods is using the pre-experiment as explained further below.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"15 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85310481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Plant Species Classification Using Leaf Edge Feature Combination with Morphological Transformations and SIFT Key Point
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.91-97
Jiraporn Thomkaew, Sarun Intakosum
This paper presents a new approach to plant classification that combines leaf edge features with Morphological Transformations and defines key points on the leaf edge with SIFT. The process has three steps: image preprocessing, feature extraction, and image classification. In the image preprocessing step, image noise is removed with Morphological Transformations and leaf edges are detected with Canny Edge Detection. Key points on the leaf edge are identified with SIFT, and leaf features are extracted by a CNN according to the proposed method. The plant leaves are then classified by a random forest. Experiments were performed on the PlantVillage dataset with 10 classes: 5 classes of healthy leaves and 5 classes of diseased leaves. The results show that the proposed method classifies plant species more accurately than features based on leaf shape and texture, reaching an accuracy of 95.62%.
{"title":"Plant Species Classification Using Leaf Edge Feature Combination with Morphological Transformations and SIFT Key Point","authors":"Jiraporn Thomkaew, Sarun Intakosum","doi":"10.18178/joig.11.1.91-97","DOIUrl":"https://doi.org/10.18178/joig.11.1.91-97","url":null,"abstract":"This paper presents a new approach to plant classification by using leaf edge feature combination with Morphological Transformations and defining key points on leaf edge with SIFT. There are three steps in the process. Image preprocessing, feature extraction, and image classification. In the image preprocessing step, image noise is removed with Morphological Transformations and leaf edge detect with Canny Edge Detection. The leaf edge is identified with SIFT, and the plant leaf feature was extracted by CNN according to the proposed method. The plant leaves are then classified by random forest. Experiments were performed on the PlantVillage dataset of 10 classes, 5 classes of healthy leaves, and 5 classes of diseased leaves. The results showed that the proposed method was able to classify plant species more accurately than using features based on leaf shape and texture. The proposed method has an accuracy of 95.62%.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"62 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82575874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep Learning in Grapevine Leaves Varieties Classification Based on Dense Convolutional Network
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.98-103
H. A. Ahmed, Hersh M. Hama, S. I. Jalal, M. Ahmed
Grapevine leaves are used worldwide in a vast range of traditional cuisines. As their price and flavor differ from kind to kind, recognizing the various species of grapevine leaves is becoming an essential task, and differentiating grapevine leaf types by human inspection is difficult and time-consuming. Building a machine learning model to automate grapevine leaf classification is therefore highly beneficial, and this is the primary focus of this work. This paper uses a CNN-based model to classify grape leaves by adapting DenseNet201, and investigates the impact of layer freezing on the performance of DenseNet201 during fine-tuning. The work uses a public dataset consisting of 500 images in 5 classes (100 images per class), with several data augmentation methods applied to expand the training set. The proposed CNN model, named DenseNet-30, outperformed the existing grape leaf classification work from which the dataset was borrowed, achieving 98% overall accuracy.
{"title":"Deep Learning in Grapevine Leaves Varieties Classification Based on Dense Convolutional Network","authors":"H. A. Ahmed, Hersh M. Hama, S. I. Jalal, M. Ahmed","doi":"10.18178/joig.11.1.98-103","DOIUrl":"https://doi.org/10.18178/joig.11.1.98-103","url":null,"abstract":"Grapevine leaves are utilized worldwide in a vast range of traditional cuisines. As their price and flavor differ from kind to kind, recognizing various species of grapevine leaves is becoming an essential task. In addition, the differentiation between grapevine leaf types by human sense is difficult and time-consuming. Thus, building a machine learning model to automate the grapevine leaf classification is highly beneficial. Therefore, this is the primary focus of this work. This paper uses a CNN-based model to classify grape leaves by adapting DenseNet201. This study investigates the impact of layer freezing on the performance of DenseNet201 throughout the fine-tuning process. This work used a public dataset consist of 500 images with 5 different classes (100 images per class). Several data augmentation methods used to expand the training set. The proposed CNN model, named DenseNet-30, outperformed the existing grape leaf classification work that the dataset borrowed from by achieving 98% overall accuracy.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82687624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pineapple Sweetness Classification Using Deep Learning Based on Pineapple Images
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.47-52
Sarunya Kanjanawattana, Worawit Teerawatthanaprapha, Panchalee Praneetpholkrang, G. Bhakdisongkhram, Suchada Weeragulpiriya
In Thailand, the pineapple is a valuable crop whose price is determined by its sweetness. A fruit's sweetness can be determined with an optical refractometer or another technique that requires expert judgment, but determining the sweetness of each fruit in this way takes time and effort. This study employed the AlexNet deep learning model to categorize pineapple sweetness levels based on the physical attributes visible in images. The dataset was divided into four classes, M1 to M4, ordered by increasing sweetness, and split into a training set (80%) and a testing set (20%). The experiments were repeated five times, each with a different number of epochs, on the prepared data. The AlexNet model produced the best results when trained on balanced data for 10 epochs with 120 images per class, reaching an accuracy of 91.78% and an F1 score of 92.31%.
{"title":"Pineapple Sweetness Classification Using Deep Learning Based on Pineapple Images","authors":"Sarunya Kanjanawattana, Worawit Teerawatthanaprapha, Panchalee Praneetpholkrang, G. Bhakdisongkhram, Suchada Weeragulpiriya","doi":"10.18178/joig.11.1.47-52","DOIUrl":"https://doi.org/10.18178/joig.11.1.47-52","url":null,"abstract":"In Thailand, the pineapple is a valuable crop whose price is determined by its sweetness. An optical refractometer or another technique that requires expert judgment can be used to determine a fruit's sweetness. Furthermore, determining the sweetness of each fruit takes time and effort. This study employed the Alexnet deep learning model to categorize pineapple sweetness levels based on physical attributes shown in images. The dataset was classified into four classes, i.e., M1 to M4, and sorted in ascending order by sweetness level. The dataset was divided into two parts: training and testing datasets. Training accounted for 80% of the dataset while testing accounted for 20%. This study's experiments were repeated five times, each with a distinct epoch and working with data that had been prepared. According to the experiment, the Alexnet model produced the greatest results when trained with balancing data across 10 epochs and 120 figures per class. The model's accuracy and F1 score were 91.78% and 92.31%, respectively.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"73 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85259602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Application of Medical Image 3D Visualization Web Platform in Auxiliary Diagnosis and Preoperative Planning
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.32-39
Shengyu Bai, Chenxin Ma, Xinjun Wang, Shaolong Zhou, Hongyu Jiang, Ling Ma, Huiqin Jiang
Three-dimensional visualization of medical image data enables doctors to observe images from more angles and higher dimensions, which is of great significance for assisting diagnosis and preoperative planning. Most 3D visualization systems are desktop applications that depend heavily on the hardware and operating system, making them difficult to use across platforms and to maintain, while web-based systems tend to have limited capabilities. To this end, we developed a web application that provides not only DICOM (Digital Imaging and Communications in Medicine) image browsing and annotation but also three-dimensional post-processing functions: multiplanar reconstruction, volume rendering, lung parenchyma segmentation, and brain MRI (Magnetic Resonance Imaging) analysis. To improve rendering speed, we run the Marching Cubes algorithm for 3D reconstruction asynchronously in the background and save the reconstructed model as glTF (GL Transmission Format). The Draco compression algorithm is then used to optimize the glTF model for more efficient rendering. In the performance evaluation, the system reconstructed a CT (Computed Tomography) series of 242 slices; the optimized model was only 6.37 MB, with a rendering time of less than 2.5 s. Three-dimensional visualization of the lung parenchyma clearly shows the volume, location, and shape of pulmonary nodules, and the segmentation and reconstruction of different brain tissues can reveal the spatial three-dimensional structure and adjacency relationships of gliomas in the brain, which has great application value in auxiliary diagnosis and preoperative planning.
{"title":"Application of Medical Image 3D Visualization Web Platform in Auxiliary Diagnosis and Preoperative Planning","authors":"Shengyu Bai, Chenxin Ma, Xinjun Wang, Shaolong Zhou, Hongyu Jiang, Ling Ma, Huiqin Jiang","doi":"10.18178/joig.11.1.32-39","DOIUrl":"https://doi.org/10.18178/joig.11.1.32-39","url":null,"abstract":"Three-dimensional visualization of medical image data can enable doctors to observe images from more angles and higher dimensions. It is of great significance for doctors to assist in diagnosis and preoperative planning. Most 3D visualization systems are based on desktop applications, which are too dependent on hardware and operating system. This makes it difficult to use across platforms and maintain. Web-based systems tend to have limited capabilities. To this end, we developed a web application, which not only provides DICOM (Digital Imaging and Communications in Medicine) image browsing and annotation functions, but also provides three-dimensional post-processing functions of multiplanar reconstruction, volume rendering, lung parenchyma segmentation and brain MRI (Magnetic Resonance Imaging) analysis. In order to improve the rendering speed, we use the Marching Cube algorithm for 3D reconstruction in the background in an asynchronous way, and save the reconstructed model as glTF (GL Transmission Format). At the same time, Draco compression algorithm is used to optimize the glTF model to achieve more efficient rendering. After performance evaluation, the system reconstructed a CT (Computed Tomography) series of 242 slices and the optimized model was only 6.37mb with a rendering time of less than 2.5s. Three-dimensional visualization of the lung parenchyma clearly shows the volume, location, and shape of pulmonary nodules. The segmentation and reconstruction of different brain tissues can reveal the spatial three-dimensional structure and adjacent relationship of glioma in the brain, which has great application value in auxiliary diagnosis and preoperative planning.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86869350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ImECGnet: Cardiovascular Disease Classification from Image-Based ECG Data Using a Multibranch Convolutional Neural Network
Pub Date: 2023-03-01. DOI: 10.18178/joig.11.1.9-14
Amir Ghahremani, C. Lofi
Reliable Cardiovascular Disease (CVD) classification performed by a smart system can help medical doctors recognize heart illnesses in patients more efficiently and effectively. Electrocardiogram (ECG) signals are an important diagnostic tool, as they are already available early in the patient's health diagnosis process and contain valuable indicators for various CVDs. Most ECG processing methods represent ECG data as a time series, often as a matrix in which each row contains the measurements of one sensor lead, and/or as transforms of such time series, such as wavelet power spectra. While methods processing such time-series data perform well in benchmarks, they remain highly dependent on factors like input noise and sequence length, and cannot always correlate lead data from different sensors well. In this paper, we propose to represent ECG signals with all lead data plotted as a single image, an approach not yet explored in the literature. We show that such an image representation, combined with our newly proposed convolutional neural network designed specifically for CVD classification, can overcome the aforementioned shortcomings. The proposed Convolutional Neural Network (CNN) is designed to extract features representing both the proportional relationships of the leads to each other and the characteristics of each lead separately. Empirical validation on the publicly available PTB, MIT-BIH, and St.-Petersburg benchmark databases shows that the proposed method outperforms time-series-based state-of-the-art approaches, yielding classification accuracies of 97.91%, 99.62%, and 98.70%, respectively.
{"title":"ImECGnet: Cardiovascular Disease Classification from Image-Based ECG Data Using a Multibranch Convolutional Neural Network","authors":"Amir Ghahremani, C. Lofi","doi":"10.18178/joig.11.1.9-14","DOIUrl":"https://doi.org/10.18178/joig.11.1.9-14","url":null,"abstract":"Reliable Cardiovascular Disease (CVD) classification performed by a smart system can assist medical doctors in recognizing heart illnesses in patients more efficiently and effectively. Electrocardiogram (ECG) signals are an important diagnostic tool as they are already available early in the patients’ health diagnosis process and contain valuable indicators for various CVDs. Most ECG processing methods represent ECG data as a time series, often as a matrix with each row containing the measurements of a sensor lead; and/or the transforms of such time series like wavelet power spectrums. While methods processing such time-series data have been shown to work well in benchmarks, they are still highly dependent on factors like input noise and sequence length, and cannot always correlate lead data from different sensors well. In this paper, we propose to represent ECG signals incorporating all lead data plotted as a single image, an approach not yet explored by literature. We will show that such an image representation combined with our newly proposed convolutional neural network specifically designed for CVD classification can overcome the aforementioned shortcomings. The proposed (Convolutional Neural Network) CNN is designed to extract features representing both the proportional relationships of different leads to each other and the characteristics of each lead separately. Empirical validation on the publicly available PTB, MIT-BIH, and St.-Petersburg benchmark databases shows that the proposed method outperforms time seriesbased state-of-the-art approaches, yielding classification accuracy of 97.91%, 99.62%, and 98.70%, respectively.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"58 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87574480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}