Deep Learning-Based Magnetic Resonance Image Segmentation and Classification for Alzheimer’s Disease Diagnosis
Pub Date: 2023-08-29 | DOI: 10.1142/s0219467825500263
Manochandar Thenralmanoharan, P. Kumaraguru Diderot
Accurate and rapid detection of Alzheimer’s disease (AD) from magnetic resonance imaging (MRI) has gained considerable attention among researchers, driven by deep learning (DL) methods that have achieved outstanding results in a variety of domains, including medical image analysis. In particular, convolutional neural networks (CNNs) are widely applied to image datasets because of their capability to handle massive unstructured data and to extract significant features automatically. Early detection is critical to the success and development of interventions, and neuroimaging characterizes the potential regions for early diagnosis of AD. This study develops a novel Deep Learning-based Magnetic Resonance Image Segmentation and Classification for AD Diagnosis (DLMRISC-ADD) model, which focuses on segmenting MRI images to detect AD through a two-stage process of skull stripping and image segmentation. In the first stage, the model employs a U-Net-based skull stripping approach to remove skull regions from the input MRIs. In the second stage, it applies the QuickNAT model for MRI image segmentation, identifying distinct structures such as white matter, gray matter, hippocampus, amygdala, and ventricles. A densely connected network (DenseNet201) feature extractor with a sparse autoencoder (SAE) classifier is then used for AD detection. A set of simulations on the ADNI dataset demonstrates the improved performance of the DLMRISC-ADD method, and the outcomes are examined extensively. The experimental results confirm the effective segmentation of the DLMRISC-ADD technique.
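A minimal sketch of the final detection stage described above (DenseNet201 features feeding a sparse autoencoder classifier), assuming PyTorch/torchvision; the layer sizes, sparsity weight, and two-class setup are illustrative choices, not values reported in the paper.

```python
import torch
import torch.nn as nn
from torchvision import models

class SparseAutoencoderClassifier(nn.Module):
    """Sparse autoencoder whose bottleneck code feeds a small classifier head."""
    def __init__(self, in_dim=1920, code_dim=256, n_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, code_dim), nn.Sigmoid())
        self.decoder = nn.Linear(code_dim, in_dim)
        self.head = nn.Linear(code_dim, n_classes)

    def forward(self, x):
        code = self.encoder(x)
        return self.head(code), self.decoder(code), code

# DenseNet201 as a frozen feature extractor (global-pooled 1920-d features).
backbone = models.densenet201(weights=None)
backbone.classifier = nn.Identity()
backbone.eval()

sae = SparseAutoencoderClassifier()
x = torch.randn(4, 3, 224, 224)          # stand-in for preprocessed MRI slices
with torch.no_grad():
    feats = backbone(x)
logits, recon, code = sae(feats)

# Training loss: cross-entropy + reconstruction + L1 sparsity on the code.
labels = torch.tensor([0, 1, 0, 1])
loss = (nn.functional.cross_entropy(logits, labels)
        + nn.functional.mse_loss(recon, feats)
        + 1e-3 * code.abs().mean())
```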
{"title":"Deep Learning-Based Magnetic Resonance Image Segmentation and Classification for Alzheimer’s Disease Diagnosis","authors":"Manochandar Thenralmanoharan, P. Kumaraguru Diderot","doi":"10.1142/s0219467825500263","DOIUrl":"https://doi.org/10.1142/s0219467825500263","url":null,"abstract":"Accurate and rapid detection of Alzheimer’s disease (AD) using magnetic resonance imaging (MRI) gained considerable attention among research workers because of an increased number of current researches being driven by deep learning (DL) methods that have accomplished outstanding outcomes in variety of domains involving medical image analysis. Especially, convolution neural network (CNN) is primarily applied for the analyses of image datasets according to the capability of handling massive unstructured datasets and automatically extracting significant features. Earlier detection is dominant to the success and development interferences, and neuroimaging characterizes the potential regions for earlier diagnosis of AD. The study presents and develops a novel Deep Learning-based Magnetic Resonance Image Segmentation and Classification for AD Diagnosis (DLMRISC-ADD) model. The presented DLMRISC-ADD model mainly focuses on the segmentation of MRI images to detect AD. To accomplish this, the presented DLMRISC-ADD model follows a two-stage process, namely, skull stripping and image segmentation. At the preliminary stage, the presented DLMRISC-ADD model employs U-Net-based skull stripping approach to remove skull regions from the input MRIs. Next, in the second stage, the DLMRISC-ADD model applies QuickNAT model for MRI image segmentation, which identifies distinct parts such as white matter, gray matter, hippocampus, amygdala, and ventricles. Moreover, densely connected network (DenseNet201) feature extractor with sparse autoencoder (SAE) classifier is used for AD detection process. A brief set of simulations is implemented on ADNI dataset to demonstrate the improved performance of the DLMRISC-ADD method, and the outcomes are examined extensively. The experimental results exhibit the effectual segmentation results of the DLMRISC-ADD technique.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2023-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45697084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Enhanced Compression Method for Medical Images Using SPIHT Encoder for Fog Computing
Pub Date: 2023-08-28 | DOI: 10.1142/s0219467825500251
Shabana Rai, Arif Ullah, Wong Lai Kuan, Rifat Mustafa
Fog computing is well suited to filtering and compressing data before sending it to a cloud server, offering an alternative way to reduce the complexity of medical image processing and steadily improve its dependability. Medical images are produced by imaging modalities such as X-rays, computed tomography (CT) scans, magnetic resonance imaging (MRI) scans, and ultrasound (US). These images are large and require substantial storage, a problem addressed through compression. Much work has been done in this area; however, before adding more techniques to fog nodes, a high compression ratio (CR) must be achieved in a shorter time so that less network traffic is consumed. This study implements an image compression technique using the Le Gall 5/3 integer wavelet transform (IWT) and a set partitioning in hierarchical trees (SPIHT) encoder, with experiments conducted on MRI images. The suggested technique compresses the medical image with an improved CR and a reduced compression time (CT). The proposed approach yields an average CR of 84.8895% and a peak signal-to-noise ratio (PSNR) of 40.92 dB. Compared with the IWT with Huffman coding, which achieves a CR of 72.36%, the proposed approach reduces the CT by 36.7434 s and improves the CR by 12%. The main shortcoming of this work is that the high CR degrades the quality of the medical images; PSNR values can be raised, and further effort can be directed at compressing colored and 3-dimensional medical images.
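For reference, below is a minimal NumPy sketch of one 1-D level of the reversible Le Gall 5/3 lifting transform with symmetric boundary extension; a full codec would apply it separably to rows and columns and feed the resulting subbands to a SPIHT encoder. The even-length-signal assumption is a simplification for illustration.

```python
import numpy as np

def legall53_forward(x):
    """One 1-D level of the reversible Le Gall 5/3 lifting transform.

    Returns (approximation, detail) integer subbands; assumes len(x) is even.
    """
    x = np.asarray(x, dtype=np.int64)
    even, odd = x[0::2], x[1::2]
    even_right = np.append(even[1:], even[-1])        # symmetric extension
    d = odd - (even + even_right) // 2                # predict step (high-pass)
    d_left = np.concatenate(([d[0]], d[:-1]))         # symmetric extension
    s = even + (d_left + d + 2) // 4                  # update step (low-pass)
    return s, d

def legall53_inverse(s, d):
    """Undo the lifting steps in reverse order to recover the signal exactly."""
    d_left = np.concatenate(([d[0]], d[:-1]))
    even = s - (d_left + d + 2) // 4
    even_right = np.append(even[1:], even[-1])
    odd = d + (even + even_right) // 2
    out = np.empty(len(s) + len(d), dtype=np.int64)
    out[0::2], out[1::2] = even, odd
    return out

x = np.random.randint(0, 256, size=64)                # stand-in scanline
s, d = legall53_forward(x)
assert np.array_equal(legall53_inverse(s, d), x)      # perfectly reversible
```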
{"title":"An Enhanced Compression Method for Medical Images Using SPIHT Encoder for Fog Computing","authors":"Shabana Rai, Arif Ullah, Wong Lai Kuan, Rifat Mustafa","doi":"10.1142/s0219467825500251","DOIUrl":"https://doi.org/10.1142/s0219467825500251","url":null,"abstract":"When it comes to filtering and compressing data before sending it to a cloud server, fog computing is a rummage sale. Fog computing enables an alternate method to reduce the complexity of medical image processing and steadily improve its dependability. Medical images are produced by imaging processing modalities using X-rays, computed tomography (CT) scans, magnetic resonance imaging (MRI) scans, and ultrasound (US). These medical images are large and have a huge amount of storage. This problem is being solved by making use of compression. In this area, lots of work is done. However, before adding more techniques to Fog, getting a high compression ratio (CR) in a shorter time is required, therefore consuming less network traffic. Le Gall5/3 integer wavelet transform (IWT) and a set partitioning in hierarchical trees (SPIHT) encoder were used in this study’s implementation of an image compression technique. MRI is used in the experiments. The suggested technique uses a modified CR and less compression time (CT) to compress the medical image. The proposed approach results in an average CR of 84.8895%. A 40.92% peak signal-to-noise ratio (PSNR) PNSR value is present. Using the Huffman coding, the proposed approach reduces the CT by 36.7434 s compared to the IWT. Regarding CR, the suggested technique outperforms IWTs with Huffman coding by 12%. The current approach has a 72.36% CR. The suggested work’s shortcoming is that the high CR caused a decline in the quality of the medical images. PSNR values can be raised, and more effort can be made to compress colored medical images and 3-dimensional medical images.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2023-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48965801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Black Gram Disease Classification via Deep Ensemble Model with Optimal Training
Pub Date: 2023-08-22 | DOI: 10.1142/s0219467825500330
Neha Hajare, A. Rajawat
The black gram crop belongs to the Fabaceae family, and its scientific name is Vigna mungo. It has high nutritional content, improves soil fertility, and fixes atmospheric nitrogen in the soil. The quality of the black gram crop is degraded by diseases such as yellow mosaic, anthracnose, powdery mildew, and leaf crinkle, which cause economic loss to farmers and reduced production. The agriculture sector needs to classify plant nutrient deficiencies in order to increase crop quality and yield, and computer vision and deep learning technologies play a crucial role in handling such difficult challenges in the agricultural and biological sectors. The typical diagnostic procedure involves a pathologist visiting the site and inspecting each plant, but manual crop disease assessment is limited by lower accuracy and the limited availability of personnel. To address these problems, automated methods are needed that can quickly identify and classify a wide range of plant diseases. In this paper, black gram disease classification is performed with a deep ensemble model with optimal training, as follows. Initially, the input dataset is enlarged via data augmentation using shifting, rotation, and shearing. The model then removes noise from the images using median filtering. After preprocessing, segmentation is performed by the proposed deep joint segmentation model to determine the ROI and non-ROI regions. Next, a feature set is extracted comprising improved multi-texton-based features, shape-based features, color-based features, and local Gabor X-OR pattern features. The model combines Deep Belief Network, Recurrent Neural Network, and Convolutional Neural Network classifiers. To tune the optimal weights of the model, a new swarm intelligence-based Self-Improved Dwarf Mongoose Optimization algorithm (SIDMO) is introduced; over the past two decades, nature-based metaheuristic algorithms have gained popularity for their ability to solve various global optimization problems. This training model ensures enhanced classification accuracy: the accuracy of SIDMO, around 94.82%, is substantially higher than that of the existing models FPA (88.86%), SSOA (88.99%), GOA (85.84%), SMA (85.11%), SRSR (85.32%), and DMOA (88.99%).
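A sketch of the augmentation and median-filtering preprocessing steps described above, assuming OpenCV and NumPy; the kernel size and the shift, rotation, and shear amounts are illustrative stand-ins, not the paper's settings.

```python
import cv2
import numpy as np

def augment(img, angle=15, shift=(10, 5), shear=0.1):
    """Apply the three augmentations named in the paper: rotate, shift, shear."""
    h, w = img.shape[:2]
    rot = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)   # rotation
    img = cv2.warpAffine(img, rot, (w, h))
    m_shift = np.float32([[1, 0, shift[0]], [0, 1, shift[1]]])  # shifting
    img = cv2.warpAffine(img, m_shift, (w, h))
    m_shear = np.float32([[1, shear, 0], [0, 1, 0]])            # shearing
    return cv2.warpAffine(img, m_shear, (w, h))

def denoise(img, ksize=3):
    """Median filtering for noise removal prior to segmentation."""
    return cv2.medianBlur(img, ksize)

leaf = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)  # stand-in leaf image
clean = denoise(augment(leaf))
```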
{"title":"Black Gram Disease Classification via Deep Ensemble Model with Optimal Training","authors":"Neha Hajare, A. Rajawat","doi":"10.1142/s0219467825500330","DOIUrl":"https://doi.org/10.1142/s0219467825500330","url":null,"abstract":"Black gram crop belongs to the Fabaceae family and its scientific name is Vigna Mungo.It has high nutritional content, improves the fertility of the soil, and provides atmospheric nitrogen fixation in the soil. The quality of the black gram crop is degraded by diseases such as Yellow mosaic, Anthracnose, Powdery Mildew, and Leaf Crinkle which causes economic loss to farmers and degraded production. The agriculture sector needs to classify plant nutrient deficiencies in order to increase crop quality and yield. In order to handle a variety of difficult challenges, computer vision and deep learning technologies play a crucial role in the agricultural and biological sectors. The typical diagnostic procedure involves a pathologist visiting the site and inspecting each plant. However, manually crop disease assessment is limited due to lesser accuracy and limited access of personnel. To address these problems, it is necessary to develop automated methods that can quickly identify and classify a wide range of plant diseases. In this paper, black gram disease classifications are done through a deep ensemble model with optimal training and the procedure of this technique is as follows: Initially, the input dataset is processed to increase its size via data augmentation. Here, the processes like shifting, rotation, and shearing take place. Then, the model starts with the noise removal of images using median filtering. Subsequent to the preprocessing, segmentation takes place via the proposed deep joint segmentation model to determine the ROI and non-ROI regions. The next process is the extraction of the feature set that includes the features like improved multi-texton-based features, shape-based features, color-based features, and local Gabor X-OR pattern features. The model combines the classifiers like Deep Belief Networks, Recurrent Neural Networks, and Convolutional Neural Networks. For tuning the optimal weights of the model, a new algorithm termed swarm intelligence-based Self-Improved Dwarf Mongoose Optimization algorithm (SIDMO) is introduced. Over the past two decades, nature-based metaheuristic algorithms have gained more popularity because of their ability to solve various global optimization problems with optimal solutions. This training model ensures the enhancement of classification accuracy. The accuracy of the SIDMO, which is around 94.82%, is substantially higher than that of the existing models, which are FPA[Formula: see text]88.86%, SSOA[Formula: see text]88.99%, GOA[Formula: see text]85.84%, SMA[Formula: see text]85.11%, SRSR[Formula: see text]85.32%, and DMOA[Formula: see text]88.99%, respectively.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2023-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43941856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Research on robust digital watermarking based on reversible information hiding
Pub Date: 2023-08-17 | DOI: 10.1142/s0219467825500354
Zhijing Gao, Weilin Qiu, Ren Wenqi, Xiao Yan
{"title":"Research on robust digital watermarking based on reversible information hiding","authors":"Zhijing Gao, Weilin Qiu, Ren Wenqi, Xiao Yan","doi":"10.1142/s0219467825500354","DOIUrl":"https://doi.org/10.1142/s0219467825500354","url":null,"abstract":"","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2023-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49136822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimal Classification Model for Text Detection and Recognition in Video Frames
Pub Date: 2023-08-04 | DOI: 10.1142/s0219467825500147
Laxmikant Eshwarappa, G. G. Rajput
The identification of text in video frames and natural scene images has recently received increased attention among researchers owing to its diverse challenges and complexities. Low resolution, complex backgrounds, blurring, color variation, diverse fonts, and varying placement of text across photo and video panels all complicate text identification. This paper proposes a novel five-stage method for identifying text in video. Initially, video-to-frame conversion is performed during pre-processing. Text regions are then verified and keyframes are recognized using a CNN. Next, improved candidate text block extraction is carried out using MSER. Subsequently, DCT features, improved distance map features, and constant gradient-based features are extracted and fed to a Long Short-Term Memory (LSTM) network for detection. Finally, OCR is applied to recognize the text in the image. In particular, the Self-Improved Bald Eagle Search (SI-BESO) algorithm is used to adjust the LSTM weights. The superiority of the SI-BESO-based technique over many other techniques is demonstrated.
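A minimal sketch of MSER-based candidate text block extraction with OpenCV; the geometric filters and the keyframe filename are hypothetical stand-ins for the paper's "improved" candidate selection rules.

```python
import cv2

def candidate_text_boxes(gray):
    """Extract MSER regions and keep boxes with text-like geometry."""
    mser = cv2.MSER_create()
    regions, _ = mser.detectRegions(gray)
    boxes = []
    for pts in regions:
        x, y, w, h = cv2.boundingRect(pts.reshape(-1, 1, 2))
        if 0.1 < w / float(h) < 10 and w * h > 50:   # crude aspect/size filter
            boxes.append((x, y, w, h))
    return boxes

frame = cv2.imread("keyframe.png", cv2.IMREAD_GRAYSCALE)  # hypothetical keyframe
if frame is not None:
    print(len(candidate_text_boxes(frame)), "candidate text regions")
```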
{"title":"Optimal Classification Model for Text Detection and Recognition in Video Frames","authors":"Laxmikant Eshwarappa, G. G. Rajput","doi":"10.1142/s0219467825500147","DOIUrl":"https://doi.org/10.1142/s0219467825500147","url":null,"abstract":"Currently, the identification of text from video frames and normal scene images has got amplified awareness amongst analysts owing to its diverse challenges and complexities. Owing to a lower resolution, composite backdrop, blurring effect, color, diverse fonts, alternate textual placement among panels of photos and videos, etc., text identification is becoming complicated. This paper suggests a novel method for identifying texts from video with five stages. Initially, “video-to-frame conversion”, is done during pre-processing. Further, text region verification is performed and keyframes are recognized using CNN. Then, improved candidate text block extraction is carried out using MSER. Subsequently, “DCT features, improved distance map features, and constant gradient-based features” are extracted. These characteristics subsequently provided “Long Short-Term Memory (LSTM)” for detection. Finally, OCR is done to recognize the texts in the image. Particularly, the Self-Improved Bald Eagle Search (SI-BESO) algorithm is used to adjust the LSTM weights. Finally, the superiority of the SI-BESO-based technique over many other techniques is demonstrated.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2023-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48648435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Jeap-BiLSTM Neural Network for Action Recognition
Pub Date: 2023-08-03 | DOI: 10.1142/s0219467825500184
Lunzheng Tan, Yanfei Liu, Li-min Xia, Shangsheng Chen, Zhanben Zhou
Human action recognition in videos is an important task in computer vision with applications in fields such as surveillance, human–computer interaction, and sports analysis. However, it is a challenging task due to the complex background changes and redundancy of long-term video information. In this paper, we propose a novel bi-directional long short-term memory method with attention pooling based on joint motion and difference entropy (JEAP-BiLSTM) to address these challenges. To obtain discriminative features, we introduce a joint entropy map that measures both the entropy of motion and the entropy of change. The Bi-LSTM method is then applied to capture visual and temporal associations in both forward and backward directions, enabling efficient capture of long-term temporal correlation. Furthermore, attention pooling is used to highlight the region of interest and to mitigate the effects of background changes in video information. Experiments on the UCF101 and HMDB51 datasets demonstrate that the proposed JEAP-BiLSTM method achieves recognition rates of 96.4% and 75.2%, respectively, outperforming existing methods. Our proposed method makes significant contributions to the field of human action recognition by effectively capturing both spatial and temporal patterns in videos, addressing background changes, and achieving state-of-the-art performance.
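A minimal PyTorch sketch of the Bi-LSTM with attention pooling described above; the feature dimension, hidden size, and class count (101, matching UCF101) are illustrative, and random tensors stand in for the joint entropy-map features.

```python
import torch
import torch.nn as nn

class BiLSTMAttnPool(nn.Module):
    def __init__(self, feat_dim=512, hidden=256, n_classes=101):
        super().__init__()
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                              bidirectional=True)   # forward + backward passes
        self.attn = nn.Linear(2 * hidden, 1)        # scores each time step
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                       # x: (batch, time, feat_dim)
        h, _ = self.bilstm(x)                   # (batch, time, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)  # attention weights over time
        pooled = (w * h).sum(dim=1)             # attention pooling
        return self.head(pooled)

model = BiLSTMAttnPool()
clip_feats = torch.randn(2, 16, 512)   # 16 frames of entropy-map features per clip
logits = model(clip_feats)             # (2, 101) class scores
```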
{"title":"A Jeap-BiLSTM Neural Network for Action Recognition","authors":"Lunzheng Tan, Yanfei Liu, Li-min Xia, Shangsheng Chen, Zhanben Zhou","doi":"10.1142/s0219467825500184","DOIUrl":"https://doi.org/10.1142/s0219467825500184","url":null,"abstract":"Human action recognition in videos is an important task in computer vision with applications in fields such as surveillance, human–computer interaction, and sports analysis. However, it is a challenging task due to the complex background changes and redundancy of long-term video information. In this paper, we propose a novel bi-directional long short-term memory method with attention pooling based on joint motion and difference entropy (JEAP-BiLSTM) to address these challenges. To obtain discriminative features, we introduce a joint entropy map that measures both the entropy of motion and the entropy of change. The Bi-LSTM method is then applied to capture visual and temporal associations in both forward and backward directions, enabling efficient capture of long-term temporal correlation. Furthermore, attention pooling is used to highlight the region of interest and to mitigate the effects of background changes in video information. Experiments on the UCF101 and HMDB51 datasets demonstrate that the proposed JEAP-BiLSTM method achieves recognition rates of 96.4% and 75.2%, respectively, outperforming existing methods. Our proposed method makes significant contributions to the field of human action recognition by effectively capturing both spatial and temporal patterns in videos, addressing background changes, and achieving state-of-the-art performance.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2023-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46374210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Survey on Epileptic Seizure Detection on Varied Machine Learning Algorithms
Pub Date: 2023-08-03 | DOI: 10.1142/s0219467825500135
Nusrat Fatma, Pawan Singh, M. K. Siddiqui
Epilepsy is a major, persistent, and critical neurological disorder that affects the human brain and is distinguished by recurrent seizures. A seizure is a phase of synchronous, abnormal firing of a population of neurons that may last from seconds to a few minutes. Epileptic seizures are transient occurrences of complete or partial irregular involuntary body movements, which may be combined with loss of consciousness. Because epileptic seizures occur unpredictably in each patient, their effects on physical activity, social interactions, and patients’ emotions are considerable, and their treatment and diagnosis carry crucial implications. This survey therefore reviews 65 research papers and presents an analysis of the machine-learning approaches adopted in each, together with the features considered in each work. It offers a comprehensive study of the performance attained by each contribution and examines the maximum performance achieved, the datasets used, and the simulation tools employed. The survey concludes with research gaps and open problems, which will benefit researchers in advancing future work on epileptic seizure detection.
{"title":"Survey on Epileptic Seizure Detection on Varied Machine Learning Algorithms","authors":"Nusrat Fatma, Pawan Singh, M. K. Siddiqui","doi":"10.1142/s0219467825500135","DOIUrl":"https://doi.org/10.1142/s0219467825500135","url":null,"abstract":"Epilepsy is an unavoidable major persistent and critical neurological disorder that influences the human brain. Moreover, this is apparently distinguished via its recurrent malicious seizures. A seizure is a phase of synchronous, abnormal innervations of a neuron’s population which might last from seconds to a few minutes. In addition, epileptic seizures are transient occurrences of complete or partial irregular unintentional body movements that combine with consciousness loss. As epileptic seizures rarely occurred in each patient, their effects based on physical communications, social interactions, and patients’ emotions are considered, and treatment and diagnosis are undergone with crucial implications. Therefore, this survey reviews 65 research papers and states an important analysis on various machine-learning approaches adopted in each paper. The analysis of different features considered in each work is also done. This survey offers a comprehensive study on performance attainment in each contribution. Furthermore, the maximum performance attained by the works and the datasets used in each work is also examined. The analysis on features and the simulation tools used in each contribution is examined. At the end, the survey expanded with different research gaps and their problem which is beneficial to the researchers for promoting advanced future works on epileptic seizure detection.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2023-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44591500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Research on Printmaking Image Classification and Creation Based on Convolutional Neural Network
Pub Date: 2023-08-03 | DOI: 10.1142/s0219467825500196
Kai Pan, Hongyan Chi
As an important form of expression in modern art, printmaking has a rich variety of types and a prominent sense of artistic hierarchy, and it is highly favored around the world for its unique artistic characteristics. Classifying print types by image feature elements improves people’s understanding of print creation. Convolutional neural networks (CNNs) perform well in image classification, so a CNN is used for printmaking analysis. Because the classification performance of a traditional convolutional image classification model is easily affected by its activation function, the T-ReLU activation function is introduced: adjustable parameters enhance the model’s soft-saturation characteristics and avoid vanishing gradients, yielding a T-ReLU convolutional model. To address the poor multi-level feature fusion of deep convolutional image classification models, an improved convolutional image classification model is then built on the T-ReLU model: it normalizes the visual input, uses an eleven-layer convolutional network with residual units in the convolutional layers, and applies a cascading strategy to fuse features and compensate for the shortcomings of plain convolutional networks. Performance tests showed that, on data covering different styles of prints, the GT-ReLU model obtains the best image classification accuracy, at 0.978. The GT-ReLU model also maintains a classification accuracy above 94.4% in multi-dataset classification tests, higher than that of other image classification models. The research thus provides a useful reference for applying visual processing technology to the classification of prints.
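The abstract does not give T-ReLU's closed form; the sketch below is a hypothetical parameterization for illustration only, assuming PyTorch: identity for positive inputs and a tanh branch scaled by a learnable parameter t for negative inputs, which keeps a nonzero gradient and bounds (soft-saturates) the negative output.

```python
import torch
import torch.nn as nn

class TReLU(nn.Module):
    """Hypothetical soft-saturating ReLU variant with an adjustable parameter t."""
    def __init__(self, t=1.0):
        super().__init__()
        self.t = nn.Parameter(torch.tensor(float(t)))  # learnable saturation scale

    def forward(self, x):
        neg = self.t * torch.tanh(x / self.t)          # soft saturation for x <= 0
        return torch.where(x > 0, x, neg)              # identity for x > 0

act = TReLU()
print(act(torch.tensor([-3.0, -0.5, 0.0, 2.0])))       # negatives saturate near -t
```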
{"title":"Research on Printmaking Image Classification and Creation Based on Convolutional Neural Network","authors":"Kai Pan, Hongyan Chi","doi":"10.1142/s0219467825500196","DOIUrl":"https://doi.org/10.1142/s0219467825500196","url":null,"abstract":"As an important form of expression in modern civilization art, printmaking has a rich variety of types and a prominent sense of artistic hierarchy. Therefore, printmaking is highly favored around the world due to its unique artistic characteristics. Classifying print types through image feature elements will improve people’s understanding of print creation. Convolutional neural networks (CNNs) have good application effects in the field of image classification, so CNN is used for printmaking analysis. Considering that the classification effect of the traditional convolutional neural image classification model is easily affected by the activation function, the T-ReLU activation function is introduced. By utilizing adjustable parameters to enhance the soft saturation characteristics of the model and avoid gradient vanishing, a T-ReLU convolutional model is constructed. A better convolutional image classification model is proposed based on the T-ReLU convolutional model, taking into account the issue of subpar multi-level feature fusion in deep convolutional image classification models. Utilize normalization to analyze visual input, an eleven-layer convolutional network with residual units in the convolutional layer, and cascading thinking to fuse convolutional network defects. The performance test results showed that in the data test of different styles of artificial prints, the GT-ReLU model can obtain the best image classification accuracy, and the image classification accuracy rate is 0.978. The GT-ReLU model maintains a classification accuracy above 94.4% in the multi-dataset test classification performance test, which is higher than that of other image classification models. For the use of visual processing technology in the field of classifying prints, the research content provides good reference value.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2023-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48183056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Product Image Recommendation with Transformer Model Using Deep Reinforcement Learning
Pub Date: 2023-08-03 | DOI: 10.1142/s0219467825500202
Yuan Liu
A product image recommendation algorithm combining a transformer model with deep reinforcement learning is proposed. First, the product image recommendation architecture collects users’ historical product image clicking behavior through the log information layer. The recommendation strategy layer uses a collaborative filtering algorithm to compute users’ long-term shopping interest and a gated recurrent unit to compute their short-term shopping interest, predicting long-term and short-term interest from users’ positive and negative feedback sequences. Second, the prediction results are fed into the transformer model for content planning, making the data format more suitable for subsequent content recommendation. Finally, the transformer’s planning results are input to a Deep Q-Learning Network, which learns product image recommendation sequences; the results are transmitted to the data result layer and presented to users through the presentation layer. The results show that the recommendations of the proposed algorithm are consistent with users’ browsing records. The average accuracy of product image recommendation is 97.1%, the maximum recommendation time is 1.0 s, and coverage and satisfaction are high, so the practical application effect is good. The algorithm can recommend more suitable products for users and promote the further development of e-commerce.
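A sketch of the short-term interest module described above (a GRU over the user's recent product-click sequence), assuming PyTorch; the vocabulary, embedding, and hidden sizes are illustrative, and the collaborative-filtering long-term score would be combined with this output downstream.

```python
import torch
import torch.nn as nn

class ShortTermInterest(nn.Module):
    def __init__(self, n_items=10000, emb=64, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(n_items, emb)      # product-id embeddings
        self.gru = nn.GRU(emb, hidden, batch_first=True)
        self.score = nn.Linear(hidden, n_items)    # next-item preference scores

    def forward(self, clicks):                     # clicks: (batch, seq_len)
        h, _ = self.gru(self.emb(clicks))
        return self.score(h[:, -1])                # interest from the last state

model = ShortTermInterest()
recent_clicks = torch.randint(0, 10000, (2, 20))   # two users, 20 recent clicks each
prefs = model(recent_clicks)                       # (2, n_items) preference scores
```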
{"title":"Product Image Recommendation with Transformer Model Using Deep Reinforcement Learning","authors":"Yuan Liu","doi":"10.1142/s0219467825500202","DOIUrl":"https://doi.org/10.1142/s0219467825500202","url":null,"abstract":"A product image recommendation algorithm with transformer model using deep reinforcement learning is proposed. First, the product image recommendation architecture is designed to collect users’ historical product image clicking behaviors through the log information layer. The recommendation strategy layer uses collaborative filtering algorithm to calculate users’ long-term shopping interest and gated recurrent unit to calculate users’ short-term shopping interest, and predicts users’ long-term and short-term interest output based on users’ positive and negative feedback sequences. Second, the prediction results are fed into the transformer model for content planning to make the data format more suitable for subsequent content recommendation. Finally, the planning results of the transformer model are input to Deep Q-Leaning Network to obtain product image recommendation sequences under the learning of this network, and the results are transmitted to the data result layer, and finally presented to users through the presentation layer. The results show that the recommendation results of the proposed algorithm are consistent with the user’s browsing records. The average accuracy of product image recommendation is 97.1%, the maximum recommended time is 1.0[Formula: see text]s, the coverage and satisfaction are high, and the practical application effect is good. It can recommend more suitable products for users and promote the further development of e-commerce.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2023-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43365571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimization with Deep Learning Classifier-Based Foliar Disease Classification in Apple Trees Using IoT Network
Pub Date: 2023-08-03 | DOI: 10.1142/s0219467825500159
K. Sameera, P. Swarnalatha
The development of any country is influenced by growth in the agriculture sector, and the prevalence of pests and diseases in plants affects the productivity of any agricultural product. Early diagnosis of disease can substantially decrease the effort and funds required for disease management, and the Internet of Things (IoT) provides a framework for automatic farming solutions. This paper devises an automated detection technique for foliar disease classification in apple trees using an IoT network. Classification is performed with a hybrid classifier that combines a Deep Residual Network (DRN) and a Deep Q Network (DQN). A new Adaptive Tunicate Swarm Sine–Cosine Algorithm (TSSCA) is used to modify the learning parameters and the weights of the proposed hybrid classifier. The TSSCA is developed by adaptively changing the navigation and foraging behavior of the tunicates in the Tunicate Swarm Algorithm (TSA) in accordance with the Sine–Cosine Algorithm (SCA). The outputs of the Adaptive TSSCA-based DRN and the Adaptive TSSCA-based DQN are merged using a cosine similarity measure to detect foliar disease. Experiments on the Plant Pathology 2020 — FGVC7 dataset yield accuracy, sensitivity, specificity, and energy of 98.36%, 98.58%, 96.32%, and 0.413 J, respectively.
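For context, below is a minimal NumPy sketch of the standard sine–cosine position update used inside SCA-style hybrids such as the one above; the adaptive tunicate-foraging modification that defines TSSCA itself is not specified in the abstract and is not reproduced here.

```python
import numpy as np

def sca_step(pop, best, t, t_max, a=2.0, rng=np.random.default_rng()):
    """One SCA update. pop: (n, dim) candidates; best: (dim,) best-so-far."""
    r1 = a - t * a / t_max                        # decays: exploration -> exploitation
    r2 = rng.uniform(0, 2 * np.pi, pop.shape)     # oscillation phase
    r3 = rng.uniform(0, 2, pop.shape)             # weight on the best position
    r4 = rng.uniform(0, 1, pop.shape)             # sine/cosine branch switch
    step = np.abs(r3 * best - pop)
    move = np.where(r4 < 0.5, r1 * np.sin(r2) * step, r1 * np.cos(r2) * step)
    return pop + move

pop = np.random.uniform(-1, 1, (30, 10))          # 30 candidate weight vectors
pop = sca_step(pop, pop[0], t=1, t_max=100)       # one iteration toward the best
```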
{"title":"Optimization with Deep Learning Classifier-Based Foliar Disease Classification in Apple Trees Using IoT Network","authors":"K. Sameera, P. Swarnalatha","doi":"10.1142/s0219467825500159","DOIUrl":"https://doi.org/10.1142/s0219467825500159","url":null,"abstract":"The development of any country is influenced by the growth in the agriculture sector. The prevalence of pests and diseases in plants affects the productivity of any agricultural product. Early diagnosis of the disease can substantially decrease the effort and the fund required for disease management. The Internet of Things (IoT) provides a framework for offering solutions for automatic farming. This paper devises an automated detection technique for foliar disease classification in apple trees using an IoT network. Here, classification is performed using a hybrid classifier, which utilizes the Deep Residual Network (DRN) and Deep [Formula: see text] Network (DQN). A new Adaptive Tunicate Swarm Sine–Cosine Algorithm (TSSCA) is used for modifying the learning parameters as well as the weights of the proposed hybrid classifier. The TSSCA is developed by adaptively changing the navigation foraging behavior of the tunicates obtained from the Tunicate Swarm Algorithm (TSA) in accordance with the Sine–Cosine Algorithm (SCA). The outputs obtained from the Adaptive TSSCA-based DRN and Adaptive TSSCA-based DQN are merged using cosine similarity measure for detecting the foliar disease. The Plant Pathology 2020 — FGVC7 dataset is utilized for the experimental process to determine accuracy, sensitivity, specificity and energy and we achieved the values of 98.36%, 98.58%, 96.32% and 0.413 J, respectively.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2023-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45955735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}