Pub Date: 2024-01-03 DOI: 10.1142/s0219467825500524
G. Babu, P. A. Khayum
Owing to its significant applications in security, iris recognition has been one of the most active research areas of the last few decades. Iris recognition frameworks are widely used in security applications because the iris carries a rich set of features and its characteristics do not change over time. Recently, emerging deep learning techniques have achieved great success in iris recognition. Nevertheless, existing iris recognition approaches have not yet fully exploited the capability of deep learning models to attain superior performance. To address the issues in conventional iris recognition models, a novel heuristic-aided deep learning framework is implemented for iris recognition. Initially, the required source iris images are gathered from the data sources and then pre-processed. Next, image segmentation is carried out by Adaptive DeepLabv3+ layers, whose parameters are optimized using the Modified Weighted Flow Direction Algorithm (MWFDA). Finally, iris recognition is accomplished by a hybrid Multiscale Dilated-Assisted Learning (MDAL) model composed of a Convolutional Neural Network (CNN) and a Residual Network (ResNet). To achieve optimal recognition results, the parameters of the CNN and ResNet are also tuned using MWFDA. The experimental results are evaluated using several distinct measures. Compared with conventional methods, the empirical results show that the recommended model enhances recognition performance.
Title: Design and Implementation of Novel Hybrid and Multiscale-Assisted CNN and ResNet Using Heuristic Advancement of Adaptive Deep Segmentation for Iris Recognition. Journal: International Journal of Image and Graphics.
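As a rough illustration of the "multiscale dilated" idea named in the abstract, the sketch below computes how far a single dilated convolution kernel reaches, using the standard relation k_eff = k + (k - 1)(d - 1). The kernel size 3 and the dilation rates 1, 2, and 4 are assumptions chosen for illustration; the abstract does not state the values used in MDAL.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in input pixels, covered by one dilated convolution kernel."""
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be positive")
    # A dilation of d inserts (d - 1) gaps between neighbouring kernel taps.
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Hypothetical multiscale configuration: one branch per dilation rate.
DILATION_RATES = (1, 2, 4)
spans = {d: effective_kernel_size(3, d) for d in DILATION_RATES}

for rate, span in spans.items():
    print(f"dilation {rate}: a 3x3 kernel spans {span}x{span} input pixels")
```

Running branches with different dilation rates in parallel lets a network see fine and coarse context with the same number of weights per kernel, which is one common reading of "multiscale dilated" learning.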
Pub Date: 2023-12-30 DOI: 10.1142/s0219467825500536
B. Samhitha, R. Subhashini
Behavioral monitoring can be used to track aquatic ecosystems and water quality over time. With precise and rapid fish performance detection, fishermen can make informed management decisions on recirculating aquaculture systems while reducing labor. Researchers have developed a large number of sensors and procedures for recognizing fish behavior. Deep learning (DL) techniques have revolutionized the ability to analyze videos automatically, and have been used for behavior analysis, live fish detection, biomass estimation, water quality monitoring, and species classification. The benefit of DL is that it learns image feature extraction automatically and performs very well at identifying sequential actions. This paper focuses on the design of a Dwarf Mongoose Optimization with Transfer Learning-based fish behavior classification (DMOTLB-FBC) model. The presented DMOTLB-FBC technique aims to monitor and classify fish behaviors effectively. Initially, the DMOTLB-FBC technique applies Gaussian filtering (GFI) for noise removal. Next, a transfer learning (TL)-based neural architectural search network (NASNet) model produces a collection of feature vectors. For fish behavior classification, a graph convolution network (GCN) model is employed. To improve the classification results of the DMOTLB-FBC technique, the Dwarf Mongoose Optimization (DMO) algorithm is applied as a hyperparameter optimizer of the GCN model. The DMOTLB-FBC technique is tested on a fish video dataset, and an extensive comparison study reports improvements of the DMOTLB-FBC technique over other recent approaches.
Title: Dwarf Mongoose Optimization with Transfer Learning-Based Fish Behavior Classification Model. Journal: International Journal of Image and Graphics.
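The Gaussian filtering step used for noise removal can be sketched in plain Python as below. This is a minimal illustration and not the authors' implementation: the kernel radius, the sigma value, the edge handling (clamping to the border pixel), and the function names are all assumptions made for the example.

```python
import math


def gaussian_kernel(radius: int, sigma: float) -> list[float]:
    """Normalized 1-D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_row(row: list[float], kernel: list[float]) -> list[float]:
    """Convolve one row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    n = len(row)
    out = []
    for x in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(x + k - radius, 0), n - 1)  # assumed edge handling
            acc += w * row[idx]
        out.append(acc)
    return out


def gaussian_filter(image: list[list[float]], radius: int = 1,
                    sigma: float = 1.0) -> list[list[float]]:
    """Separable Gaussian blur: filter the rows first, then the columns."""
    kernel = gaussian_kernel(radius, sigma)
    rows = [smooth_row(r, kernel) for r in image]
    cols = [smooth_row(list(c), kernel) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


# A single bright pixel stands in for impulse noise in a video frame.
noisy = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
for line in gaussian_filter(noisy):
    print([round(v, 3) for v in line])
```

Because the 2-D Gaussian is separable, filtering rows and then columns gives the same result as a full 2-D kernel at lower cost.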
Pub Date: 2023-12-30 DOI: 10.1142/s0219467825500512
Mengting Ye, Zhenxue Chen, Yixin Guo, Kaili Yu, Longcheng Liu
Computer vision obtains object and environment information by simulating human visual senses and borrowing from human sensory activity. As one of the main tasks of computer vision, image classification can be used not only for face recognition, traffic scene recognition, image retrieval, and automatic photo categorization but also as a theoretical basis for target detection and image segmentation. In this paper, we build on the existing CNN architecture ConvNeXt. By adapting and modifying the residual connectivity and convolutional structure of the network, we achieve a balance between classification accuracy and inference speed. These modifications reduce both computation and memory consumption while keeping accuracy largely unchanged, which makes the network more lightweight.
Title: MRCNet: Multi-Level Residual Connectivity Network for Image Classification. Journal: International Journal of Image and Graphics.
Pub Date: 2023-12-28 DOI: 10.1142/s0219467824500396
S. Veling, T. B. Mohite-Patil
Diseases in crop plants can threaten global food security, since many of them directly degrade the quality of grains, vegetables, and fruits and thereby reduce agricultural productivity. Like other plants, the mango tree is affected by several diseases; identifying multiple diseases from a single leaf is complex, and detection with the bare eye is error-prone, inconsistent, and unreliable. Diseases that strike mango trees during production harm plant health and lower yield, which in turn impacts the economy. Detecting plant diseases is especially critical given the large variety of trees and plants. Various deep learning studies focus on identifying diseases in plants, including leaves and fruits. The main objective of this paper is therefore to implement an efficient and suitable technique for diagnosing mango tree diseases and identifying their symptoms from fruit and leaf images, addressing the need for a cost-effective, early solution to this problem. To this end, the paper presents novel deep learning models for mango tree multi-disease classification. Initially, images of diseased leaves and fruits of the mango tree are collected. Then, contrast enhancement is performed with Contrast-Limited Adaptive Histogram Equalization (CLAHE). A Convolutional Neural Network (CNN) is employed for deep feature extraction from the leaf and fruit images, and the features from both inputs are concatenated for further processing. Next, weighted feature selection with Adaptive Squirrel-Grey Wolf Search Optimization (AS-GWSO) selects the most significant features. An enhanced Long Short-Term Memory (LSTM) network performs the classification, with its parameters optimized by the same AS-GWSO to improve classification accuracy. Finally, results on various mango tree diseases show that the designed approach yields the highest accuracy among the conventional approaches compared. It could therefore help in treating affected mango leaves accurately.
Title: Multi-disease Classification of Mango Tree Using Meta-heuristic-based Weighted Feature Selection and LSTM Model. Journal: International Journal of Image and Graphics.
Pub Date: 2023-12-28 DOI: 10.1142/s0219467825500482
Yawen Tang, Jianhong Ren
The continuous development of virtual reality animation has brought people a new viewing experience. However, the construction of virtual scenes still leaves considerable room for research. Underwater scenes are complex and diverse, and obtaining more realistic virtual scenes requires video panoramic images as a modeling reference. To this end, the study uses the K-means clustering method to extract key frames from underwater video, and improves the extraction algorithm by adaptively adjusting the number of clusters according to the differences in features. To address the low contrast and severe blurring of underwater images, the study uses an improved non-local prior recovery method to restore them. Finally, the underwater panoramic image is obtained through fade-out image fusion and a frame-to-stitched-image synthesis strategy. The experimental analysis shows that Model 1 has a runtime of 21.46 s, a root mean square error of 1.89, a structural similarity of 0.9678, and an average gradient of 12.59. It can achieve efficient and high-quality panoramic image generation.
Title: Feature Matching-Based Undersea Panoramic Image Stitching in VR Animation. Journal: International Journal of Image and Graphics.
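The fade-out fusion mentioned in the abstract is commonly implemented as a linear cross-fade over the region where two frames overlap. The sketch below shows that idea on a single row of pixel intensities; the linear weighting, the function name, and the sample values are assumptions for illustration, since the abstract does not give the exact weighting used in the paper.

```python
def fade_blend_rows(left: list[float], right: list[float],
                    overlap: int) -> list[float]:
    """Stitch two pixel rows whose last/first `overlap` pixels coincide.

    Inside the overlap the left image fades out linearly while the right
    image fades in, which hides the seam between the two frames.
    """
    if not 0 < overlap <= min(len(left), len(right)):
        raise ValueError("overlap must be between 1 and the shorter row length")
    start = len(left) - overlap
    blended = []
    for i in range(overlap):
        w_right = (i + 1) / (overlap + 1)  # grows across the overlap
        w_left = 1.0 - w_right             # shrinks across the overlap
        blended.append(w_left * left[start + i] + w_right * right[i])
    return left[:start] + blended + right[overlap:]


# Two frames with different exposure and a three-pixel overlap.
left_row = [10, 10, 10, 10]
right_row = [20, 20, 20, 20]
print(fade_blend_rows(left_row, right_row, overlap=3))
```

A full stitcher would apply the same weights to every row (and colour channel) of the aligned overlap region after feature matching has registered the frames.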
Pub Date: 2023-11-20 DOI: 10.1142/s0219467825500470
Santosh Kumar Tripathy, Poonkuntran Shanmugam
Crowd behavior prediction (CBP) and crowd counting (CC) are essential functions of vision-based crowd analysis (CA), which plays a crucial role in controlling crowd disasters. Using separate models for CBP and CC increases computational overhead and introduces synchronization issues. State-of-the-art approaches use deep convolutional architectures to exploit spatial-temporal features, but such models suffer from high computational complexity during convolution operations. To resolve these issues, this paper develops a single deep model that performs two functions of CA: CBP and CC. The proposed model uses multiple layers of depth-wise separable CNN (DSCNN) to extract fine-grained spatial-temporal features from the scene. Compared with a traditional CNN, the DSCNN reduces the number of matrix multiplications in the convolution operation. Furthermore, existing datasets support only a single CA function, whereas the proposed model needs a dual-task CA dataset that provides ground-truth labels for both CBP and CC. A dual-functionality CA dataset is therefore prepared from a benchmark crowd behavior dataset, MED. Around 41,000 frames have been manually annotated to obtain ground-truth crowd count values. Experiments on the proposed multi-functional dataset show that the model outperforms state-of-the-art methods on several performance metrics. In addition, the proposed model processes each test frame in 3.40 milliseconds and is thus readily applicable in real time.
Title: Real-Time Spatial-Temporal Depth Separable CNN for Multi-Functional Crowd Analysis in Videos. Journal: International Journal of Image and Graphics.
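The claim that depth-wise separable convolution needs fewer multiplications than a traditional convolution can be checked with a short count. The sketch below assumes a 2-D layer with stride 1 and "same" padding; the feature-map size, kernel size, and channel counts are illustrative values, not figures from the paper, and the paper's spatial-temporal layers may differ in detail.

```python
def standard_conv_mults(h: int, w: int, k: int, c_in: int, c_out: int) -> int:
    """Multiplications for a k x k standard convolution on an h x w output."""
    return h * w * k * k * c_in * c_out


def separable_conv_mults(h: int, w: int, k: int, c_in: int, c_out: int) -> int:
    """Multiplications for a depth-wise separable convolution."""
    depthwise = h * w * k * k * c_in   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer: 56 x 56 feature map, 3 x 3 kernel, 64 -> 128 channels.
H, W, K, C_IN, C_OUT = 56, 56, 3, 64, 128
standard = standard_conv_mults(H, W, K, C_IN, C_OUT)
separable = separable_conv_mults(H, W, K, C_IN, C_OUT)

print(f"standard:  {standard:,} multiplications")
print(f"separable: {separable:,} multiplications")
print(f"ratio:     {separable / standard:.4f}")
```

The ratio simplifies to 1/c_out + 1/k^2, so for 3 x 3 kernels and a large number of output channels the separable form needs roughly one ninth of the multiplications.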
Pub Date: 2023-11-18 DOI: 10.1142/s021946782550038x
Samrajam Jyothula, S. Chandrasekhar
Land cover (LC) categorization is a necessary task in the intelligent interpretation of remote sensing imagery; it assigns every pixel to one of a predefined set of LC classes. Land Use and Land Cover (LULC) information can provide insights that help address environmental and socioeconomic impacts such as disaster risk, climate change, poverty, and food insecurity. In conventional works, image categorization relies on classical visual interpretation techniques, which depend entirely on professional knowledge and a professional's classification experience and are therefore subjective, inefficient, and time-consuming. To overcome this issue, a deep-structured approach is suggested for LC image classification. Initially, the land images are gathered. The collected images are then split into multiple patches. After splitting, the patches are fed to the Ensemble-based Convolutional Neural Network (ECNN), which is constructed from a Fully Convolutional Network (FCN), U-Net, DeepLabv3, and a Mask Region-based Convolutional Neural Network (Mask R-CNN), to perform segmentation. Here, the hyperparameters are optimally tuned with the Hybrid Billiards-inspired Water Wave Algorithm (HB-WWA), which integrates the Billiards-inspired Optimization Algorithm (BOA) and the Water Wave Algorithm (WWA). Finally, the classification is carried out with a fuzzy classifier. The performance is validated and measured through diverse metrics. The developed work demonstrates enhanced classification accuracy when compared with existing algorithms.
Title: CNN-LandCoverNet: An Effective Framework of Land Cover Classification Using Hybrid Metaheuristic-Aided Ensemble-Based Convolutional Neural Network. Journal: International Journal of Image and Graphics.
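The patch-splitting step that precedes segmentation can be sketched as follows. The abstract does not specify patch size, stride, or border handling, so this example assumes non-overlapping patches and simply drops any partial patch at the right and bottom edges; the function name and the toy image are also illustrative.

```python
def split_into_patches(image: list[list[int]], patch_h: int,
                       patch_w: int) -> list[list[list[int]]]:
    """Cut a 2-D image into non-overlapping patch_h x patch_w patches.

    Patches are returned in row-major order; incomplete patches at the
    right and bottom borders are discarded (an assumed policy).
    """
    if patch_h < 1 or patch_w < 1:
        raise ValueError("patch dimensions must be positive")
    rows = len(image)
    cols = len(image[0]) if rows else 0
    patches = []
    for top in range(0, rows - patch_h + 1, patch_h):
        for left in range(0, cols - patch_w + 1, patch_w):
            patch = [row[left:left + patch_w]
                     for row in image[top:top + patch_h]]
            patches.append(patch)
    return patches


# Toy 4 x 6 "image" whose pixel values encode their own position.
image = [[r * 6 + c for c in range(6)] for r in range(4)]
for patch in split_into_patches(image, patch_h=2, patch_w=3):
    print(patch)
```

Each patch could then be passed to the segmentation networks independently, with the per-patch outputs reassembled in the same row-major order.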
The benefits of automatic dietary assessment systems in helping diabetic and prediabetic people control the risk factor often referred to as the obesity "pandemic" are now widely proven and accepted. However, there is no universal solution, as people's eating habits depend on context and culture. This project is a cornerstone for future work by researchers and health professionals on the automatic dietary assessment of Mauritian dishes. We propose a process to produce a food dataset for Mauritian dishes using a Generative Adversarial Network (GAN), and a fine Convolutional Neural Network (CNN) model for identifying Mauritian food dishes. The outputs and findings of this research can be used for automatic calorie calculation and food recommendation, primarily on ubiquitous devices such as mobile phones via mobile applications. Using the Adam optimizer with carefully fixed hyper-parameters, we achieved an accuracy of 95.66% and a loss of 3.5% on the recognition task.
{"title":"DLMDish: Using Applied Deep Learning and Computer Vision to Automatically Classify Mauritian Dishes","authors":"Mohammud Shaad Ally Toofanee, Omar Boudraa, Karim Tamine","doi":"10.1142/s0219467825500457","DOIUrl":"https://doi.org/10.1142/s0219467825500457","url":null,"PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":"202 4","pages":""},"PeriodicalIF":1.6,"publicationDate":"2023-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139262226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-11-03 DOI: 10.1142/s0219467825500469
N. P. Jayasri, R. Aruna
Over the past decades, the number of people affected by diabetes, a chronic illness, has increased widely. Early prediction of diabetes remains a challenging problem because precise prediction requires clear and sound datasets. In this era of ubiquitous information technology, big data helps to collect a large amount of information about healthcare systems. Owing to the explosion in the generation of digital data, selecting appropriate data for analysis remains a complex task. Moreover, missing values and insignificantly labeled data restrict the prediction accuracy. In this context, the proposed model improves the quality of the dataset and handles missing values through three major phases: (1) pre-processing, (2) feature extraction, and (3) classification. Pre-processing involves outlier rejection and filling in missing values. Feature extraction is performed by principal component analysis (PCA), and the prediction of diabetes is then carried out by a distance adaptive-KNN (DA-KNN) classifier. The experiments were conducted on the Pima Indian Diabetes (PID) dataset, and the performance of the proposed model was compared with state-of-the-art models. The analysis shows that the proposed model outperforms conventional models such as NB, SVM, KNN, and RF in terms of accuracy and ROC.
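The pre-processing and classification phases described above can be illustrated with a small sketch. This is not the authors' DA-KNN: the abstract does not specify how the distance adaptation works, so the example assumes plain inverse-distance-weighted voting as a stand-in, fills missing values with the column median, and omits outlier rejection and PCA. The function names and the toy data are hypothetical and are not taken from the PID dataset.

```python
import math
from statistics import median


def impute_median(rows):
    """Replace None entries in each column with that column's median."""
    columns = list(zip(*rows))
    medians = [median(v for v in col if v is not None) for col in columns]
    return [[m if v is None else v for v, m in zip(row, medians)] for row in rows]


def weighted_knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest neighbours.

    Assumption: inverse-distance weighting stands in for the paper's
    distance-adaptive rule, which the abstract does not describe.
    """
    nearest = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )[:k]
    votes = {}
    for distance, label in nearest:
        # Closer neighbours contribute more; the small constant avoids division by zero.
        votes[label] = votes.get(label, 0.0) + 1.0 / (distance + 1e-9)
    return max(votes, key=votes.get)


# Toy rows of [glucose, BMI] with missing entries marked as None (hypothetical values).
features = [
    [85.0, 22.0], [90.0, None], [88.0, 24.0],
    [160.0, 35.0], [170.0, 33.0], [None, 36.0],
]
labels = [0, 0, 0, 1, 1, 1]

clean = impute_median(features)
prediction = weighted_knn_predict(clean, labels, [165.0, 34.0], k=3)
print(prediction)
```

In practice the features would also be standardized before computing distances, since a column with a wide numeric range, such as glucose, otherwise dominates the Euclidean distance.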
{"title":"A Novel Diabetes Prediction Model in Big Data Healthcare Systems Using DA-KNN Technique","authors":"N. P. Jayasri, R. Aruna","doi":"10.1142/s0219467825500469","DOIUrl":"https://doi.org/10.1142/s0219467825500469","url":null,"PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":"28 8","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135873594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-11-03 DOI: 10.1142/s0219467825500421
Peng Zhang, Yangyang Miao, Dongri Shan, Shuang Li
In the 2D–3D registration process, differences in CAD model sizes mean that models may be too large to be displayed in full or too small to show obvious features. To address these problems, previous studies have adjusted parameters manually; however, this is imprecise and frequently requires multiple adjustments. Thus, in this paper, we propose the model self-adaptive display of fixed-distance and maximization (MSDFM) algorithm. The uncertainty of the model display affects the storage costs of pose images, and pose images themselves occupy a large amount of storage space; thus, we also propose the storage optimization based on the region of interest (SOBROI) method to reduce storage costs. The proposed MSDFM algorithm retrieves the farthest point of the model and then searches for the maximum pose image of the model display through the farthest point. The algorithm then changes the projection angle until the maximum pose image is maximized within the window. The pose images are then cropped by the proposed SOBROI method to reduce storage costs. The connected domains in the binary pose image are labeled, and the bounding rectangle of the largest connected domain is used to crop the pose image, which is then saved in the losslessly compressed Portable Network Graphics (PNG) format. Experimental results demonstrate that the proposed MSDFM algorithm can automatically adjust models of different sizes. In addition, the results show that the proposed SOBROI method reduces the storage space of pose libraries by at least 89.66% and at most 99.86%.
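The cropping step attributed to SOBROI above (label the connected domains, take the bounding rectangle of the largest one, crop) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which connectivity is used, so 4-connectivity is assumed; the image is represented as a nested list of 0/1 values; the function names are hypothetical; and the PNG encoding step is omitted.

```python
from collections import deque


def largest_component_bbox(mask):
    """Return (top, left, bottom, right), inclusive, of the largest
    4-connected foreground region, or None if the mask has no foreground."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    best_size, best_box = 0, None
    for r0 in range(height):
        for c0 in range(width):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # A breadth-first flood fill visits one connected domain.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            size = 0
            top, left, bottom, right = r0, c0, r0, c0
            while queue:
                r, c = queue.popleft()
                size += 1
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if size > best_size:
                best_size, best_box = size, (top, left, bottom, right)
    return best_box


def crop_to_box(image, box):
    """Crop a nested-list image to an inclusive (top, left, bottom, right) box."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Toy binary pose image: one 5-pixel region and one isolated pixel.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
box = largest_component_bbox(mask)
cropped = crop_to_box(mask, box)
print(box, cropped)
```

In this toy case the 4 × 6 image shrinks to a 2 × 3 crop, which illustrates why discarding the empty background around the model reduces the stored size of each pose image.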
{"title":"Model Self-Adaptive Display for 2D–3D Registration","authors":"Peng Zhang, Yangyang Miao, Dongri Shan, Shuang Li","doi":"10.1142/s0219467825500421","DOIUrl":"https://doi.org/10.1142/s0219467825500421","url":null,"PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":"28 10","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135873592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}