Pub Date: 2023-12-01 DOI: 10.18178/joig.11.4.414-427
Sachin Gaur, Navneet Tripathi, Jyoti Pandey
In the digital age, protecting the ownership and data veracity of digital documents is a major challenge. Digital watermarking has emerged as a solution for copyright protection and data verification of digital media. In this paper, we present an adaptive hybrid image watermarking approach that combines the Discrete Wavelet Transform (DWT) and Singular Value Decomposition (SVD). The method applies DWT to both the host image and the watermark, then applies SVD to the Low-Low (LL) sub-band of each. The singular values of the host image are modified using the singular values of the watermark, after which inverse SVD and inverse DWT are applied to obtain the watermarked image. The reverse process is then applied to extract the watermark. Finally, we evaluate the approach by measuring the Peak Signal-to-Noise Ratio (PSNR) between the original and watermarked images and the Normalized Cross-Correlation (NCC) between the original and extracted watermarks. Simulation results indicate that the proposed method offers better robustness, imperceptibility, and capacity than previously presented schemes.
{"title":"A Hybrid DWT-SVD Based Adaptive Image Watermarking Scheme","authors":"Sachin Gaur, Navneet Tripathi, Jyoti Pandey","doi":"10.18178/joig.11.4.414-427","DOIUrl":"https://doi.org/10.18178/joig.11.4.414-427","url":null,"abstract":"In the digital age, protecting the ownership and data veracity of digital documents is a major challenge. To address the issues concerning copyright protection and data verification of digital media, digital watermarking has emerged as a solution. In this paper, we aspire to make a modest contribution to this emerging and exciting field by presenting our proposed adaptive hybrid image watermarking approach that combines Discrete Wavelet Transform (DWT) and Singular Value Decomposition (SVD). Our method involves applying DWT to both the host image and watermark, followed by singular decomposition using SVD on the Low-Low (LL) component of both images. Now modify the singular values of the host image by the singular values of the watermark, and then inverse SVD is applied, followed by inverse DWT, to obtain the watermarked image. After that, the reverse process is applied to obtain the watermark image. Finally, we evaluate our approach’s performance by measuring the Peak Signal-to-Noise Ratio (PSNR) between the original and watermarked image as well as the Normalized Cross-Correlation (NCC) between the original and extracted watermark. 
Simulation results indicate that the proposed method is rich in terms of robustness, imperceptibility and capacity than the previously presented schemes.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":" 12","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138620978","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
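The two evaluation metrics named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes 8-bit grayscale images stored as nested lists, and it uses one common NCC definition (the abstract does not say which variant the paper uses). The function names `psnr` and `ncc` are hypothetical.

```python
import math

def psnr(original, distorted, peak=255.0):
    """PSNR in dB between two equal-sized grayscale images (nested lists)."""
    flat_o = [p for row in original for p in row]
    flat_d = [p for row in distorted for p in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_o, flat_d)) / len(flat_o)
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10.0 * math.log10(peak ** 2 / mse)

def ncc(reference, extracted):
    """Normalized cross-correlation (assumed variant: dot product over norms)."""
    flat_r = [p for row in reference for p in row]
    flat_e = [p for row in extracted for p in row]
    num = sum(a * b for a, b in zip(flat_r, flat_e))
    den = math.sqrt(sum(a * a for a in flat_r)) * math.sqrt(sum(b * b for b in flat_e))
    return num / den if den else 0.0
```

A higher PSNR suggests the watermark is less visible in the host image, while an NCC close to 1 suggests the extracted watermark matches the embedded one.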
Pub Date: 2023-12-01 DOI: 10.18178/joig.11.4.405-413
Hassnae Remmach, Raja Mouachi, M. Sadgal, Aziz El Fazziki
The use of 3D reconstruction in computer vision applications has opened up new avenues for research and development. It has a significant impact on a range of industries, from healthcare to robotics, by improving the performance and capabilities of computer vision systems. In this paper, we aim to improve the quality and accuracy of 3D reconstruction. The objective is to develop a model that learns to extract features, estimate Supershape parameters, and reconstruct a 3D object directly from an input point cloud. This work continues our earlier research, which used a CNN-based Multi-Output and Multi-Task Regressor for 3D reconstruction from 3D point clouds. To refine that methodology and extend our findings, we propose a new approach, “Reg-PointNet++”, which is based on a PointNet++ architecture adapted for multi-task regression, with the goal of reconstructing a 3D object modeled by Supershapes from a 3D point cloud. We build on PointNet++ because of the difficulties encountered in applying convolution to point clouds. The network extracts features from the 3D point cloud, which are then fed into a Multi-task Regressor that predicts the Supershape parameters needed to reconstruct the shape. The approach has shown promising results in reconstructing 3D objects modeled by Supershapes, demonstrating improved accuracy and robustness to noise and outperforming existing techniques. Visually, the predicted shapes closely resemble the real shapes, and a high accuracy rate is reached in a reasonable number of iterations. Overall, the approach has the potential to significantly improve the accuracy and efficiency of 3D reconstruction, enabling its use in a wider range of applications.
{"title":"Reg-PointNet++: A CNN Network Based on PointNet++ Architecture for 3D Reconstruction of 3D Objects Modeled by Supershapes","authors":"Hassnae Remmach, Raja Mouachi, M. Sadgal, Aziz El Fazziki","doi":"10.18178/joig.11.4.405-413","DOIUrl":"https://doi.org/10.18178/joig.11.4.405-413","url":null,"abstract":"The use of 3D reconstruction in computer vision applications has opened up new avenues for research and development. It has a significant impact on a range of industries, from healthcare to robotics, by improving the performance and abilities of computer vision systems. In this paper we aim to improve 3D reconstruction quality and accuracy. The objective is to develop a model that can learn to extract features, estimate a Supershape parameters and reconstruct 3D directly from input points cloud. In this regard, we present a continuity of our latest works, using a CNN-based Multi-Output and Multi-Task Regressor, for 3D reconstruction from 3D point cloud. We propose another new approach in order to refine our previous methodology and expand our findings. It is about “Reg-PointNet++”, which is mainly based on a PointNet++ architecture adapted for multi-task regression, with the goal of reconstructing a 3D object modeled by Supershapes from 3D point cloud. Given the difficulties encountered in applying convolution to point clouds, our approach is based on the PointNet ++ architecture. It is used to extract features from the 3D point cloud, which are then fed into a Multi-task Regressor for predicting the Supershape parameters needed to reconstruct the shape. The approach has shown promising results in reconstructing 3D objects modeled by Supershapes, demonstrating improved accuracy and robustness to noise and outperforming existing techniques. Visually, the predicted shapes have a high likelihood with the real shapes, as well as a high accuracy rate in a very reasonable number of iterations. 
Overall, the approach presented in the paper has the potential to significantly improve the accuracy and efficiency of 3D reconstruction, enabling its use in a wider range of applications.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"60 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138627535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
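To make concrete what "Supershape parameters" means, the sketch below evaluates the standard Gielis superformula and the usual spherical-product construction of a 3D supershape. It is an illustration under assumptions: the abstract does not state which parameterization the regressor predicts, and the names `superformula` and `supershape_point` are hypothetical.

```python
import math

def superformula(angle, m, n1, n2, n3, a=1.0, b=1.0):
    """Radius of the 2D Gielis superformula at the given angle (radians)."""
    t = m * angle / 4.0
    s = abs(math.cos(t) / a) ** n2 + abs(math.sin(t) / b) ** n3
    return s ** (-1.0 / n1)

def supershape_point(theta, phi, params1, params2):
    """3D point from the spherical product of two superformula profiles.

    theta is the longitude in [-pi, pi]; phi is the latitude in [-pi/2, pi/2].
    params1 and params2 are (m, n1, n2, n3) tuples, one per profile.
    """
    r1 = superformula(theta, *params1)
    r2 = superformula(phi, *params2)
    x = r1 * math.cos(theta) * r2 * math.cos(phi)
    y = r1 * math.sin(theta) * r2 * math.cos(phi)
    z = r2 * math.sin(phi)
    return (x, y, z)
```

With `(m, n1, n2, n3) = (4, 2, 2, 2)` for both profiles the surface reduces to the unit sphere, which is a convenient sanity check before trying other parameter values.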
Pub Date: 2023-12-01 DOI: 10.18178/joig.11.4.330-342
Ade Bastian, Adie Iman Nurzaman, Tri Ferga Prasetyo, Sri Fatimah
Roselle is a fiber-producing plant with broad benefits as a health food, so many farmers are interested in cultivating it. This study aims to design a roselle pest detection system to reduce the risk of crop failure or reduced yields of roselle calyx. The system uses thresholding as its digital image processing method, connected via the internet to an information media application, together with template matching to detect and classify pests on roselle plants. A pest detection system based on thresholding and template matching was successfully built. Because datasets of roselle pests are not yet widely available, the system was developed with a dataset collected from roselle plants and evaluated with limited test data. Testing yielded 75% accuracy; the detection process is affected by lighting and camera quality.
{"title":"Roselle Pest Detection and Classification Using Threshold and Template Matching","authors":"Ade Bastian, Adie Iman Nurzaman, Tri Ferga Prasetyo, Sri Fatimah","doi":"10.18178/joig.11.4.330-342","DOIUrl":"https://doi.org/10.18178/joig.11.4.330-342","url":null,"abstract":"Roselle is a fiber-producing plant that has broad benefits for health food, so many farmers are interested in starting to cultivate it. This study aims to design a rosella plant pest detection system to reduce the risk of crop failure or reduced yields of rosella calyx. The design of a system for detecting and classifying rosella pests uses the threshold method as a digital image processing method connected via the internet with information media applications and template matching to detect and classify pests on rosella plants. Detection of pests on rosella plants has been successfully built using a detection system using thresholding and template matching methods. Datasets of rosella plant pests that are not yet widely available encourage the detection of rosella plant pests with datasets from rosella plant objects and limited data testing. Testing with 75% accuracy, the detection process is affected by light and camera quality.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":" 9","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138612885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
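The combination of thresholding and template matching described above can be sketched in a few lines. This is a simplified illustration, not the system from the paper: it assumes a fixed global threshold and a sum-of-squared-differences match score, neither of which the abstract specifies, and the function names are hypothetical.

```python
def threshold(image, level):
    """Binarize a grayscale image (nested lists): 1 where pixel >= level."""
    return [[1 if p >= level else 0 for p in row] for row in image]

def match_template(image, template):
    """Slide the template over the image; return (row, col, ssd) of the best fit.

    The score is the sum of squared differences, so lower is better.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best = None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            ssd = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(th)
                for j in range(tw)
            )
            if best is None or ssd < best[2]:
                best = (r, c, ssd)
    return best
```

A fixed threshold makes the result sensitive to illumination, which fits the abstract's remark that detection is affected by light and camera quality.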
As IoT and cloud computing have grown in popularity, medical images are now often transmitted between devices or accessed directly from the cloud. Security is therefore a constant concern, as these images are prone to many types of attack. We propose a method that is efficient in terms of security, time complexity, and integrity, and that is cloud-friendly so that it can be deployed in the cloud and made accessible to users at any time. The goal of the work is to create a dynamic key that, depending on fuzzy values, alters the reproduction rate parameter at each iteration. A fuzzy triangular membership function generates the reproduction rate parameter from the last chaotic value produced in the previous iteration. The novelty and main benefit of the proposed strategy is that it can increase the security of an algorithm that uses a chaotic map and a static key. The method is designed not only to demonstrate security against different attacks but also to be computationally efficient. The technique has been tested on a set of images and compared with an existing algorithm using a variety of security metrics, including the correlation coefficient, Number of Pixel Change Rate (NPCR), Unified Average Changing Intensity (UACI), and entropy. The comparative analysis shows that the proposed approach can make the existing algorithm more secure.
{"title":"An Enhanced Security in Medical Image Encryption Using Dynamic Chaotic Fuzzy Based Technique","authors":"Snehashish Bhattacharjee, Mousumi Gupta, Biswajoy Chatterjee","doi":"10.18178/joig.11.4.376-383","DOIUrl":"https://doi.org/10.18178/joig.11.4.376-383","url":null,"abstract":"As IoT and cloud computing have grown in popularity, medical images are now often transmitted between devices or accessed directly from the cloud. With this, the security is always a concern as these images are prone to many types of attack. We have proposed a proven method that is efficient in terms of security, time complexity, and integrity in order to be cloud-friendly so that it may be launched into the cloud and made accessible to users at any time. The goal of the work is to create a dynamic key that, depending on fuzzy values, alters the reproduction rate parameters with each repetition. By applying the last chaotic value created from the previous iteration, the fuzzy triangular membership function has been used in this manner to generate the reproduction rate parameter. The uniqueness and major benefit of the suggested strategy are that it can increase the security of the algorithm that makes use of a chaotic map and a static key. The method has been put forth when designing algorithms so that it should not only demonstrate security against different attacks but also provide efficiency towards computational complexity. The technique has been tested against a set of images and an existing algorithm using a variety of security metrics, including the correlation coefficient, Number of Pixel Change Rate (NPCR), Unified Average Changing Intensity (UACI), and entropy. 
It has been determined from the comparative analysis that the proposed approach can make the existing algorithm more secure.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"121 ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138615226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
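Two of the security metrics listed above, NPCR and UACI, have widely used definitions that can be sketched as follows. The code assumes two equal-sized 8-bit cipher images stored as nested lists (typically produced from plain images differing in one pixel); the function names are hypothetical and the paper's exact evaluation protocol is not given in the abstract.

```python
def npcr(cipher1, cipher2):
    """Number of Pixel Change Rate: percentage of positions whose values differ."""
    flat1 = [p for row in cipher1 for p in row]
    flat2 = [p for row in cipher2 for p in row]
    changed = sum(1 for a, b in zip(flat1, flat2) if a != b)
    return 100.0 * changed / len(flat1)

def uaci(cipher1, cipher2, peak=255.0):
    """Unified Average Changing Intensity: mean absolute difference over peak, in %."""
    flat1 = [p for row in cipher1 for p in row]
    flat2 = [p for row in cipher2 for p in row]
    total = sum(abs(a - b) / peak for a, b in zip(flat1, flat2))
    return 100.0 * total / len(flat1)
```

Both values are percentages; for a cipher that diffuses small input changes well, NPCR is expected to be close to 100%.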
Pub Date: 2023-12-01 DOI: 10.18178/joig.11.4.353-358
Radhwan M. W. Khaleel, N. M. Basheer
Skin cancer has become the fifth-most dangerous type of cancer. Melanoma, the most aggressive type of skin cancer, should be detected and treated early to reduce the risk of it spreading to other organs. This study aims to provide fast and painless detection of skin cancer using image processing in MATLAB, including enhancement and extraction of relevant features for characterizing and classifying images of affected skin as melanoma or non-melanoma. The texture features extracted from the input images are the Gray Level Co-occurrence Matrix (GLCM) and Local Binary Pattern (LBP). Melanoma and non-melanoma are classified by training a Support Vector Machine (SVM) with a radial basis function kernel. The testing accuracy is 94.87%.
{"title":"Melanoma Detection Based on SVM Using MATLAB","authors":"Radhwan M. W. Khaleel, N. M. Basheer","doi":"10.18178/joig.11.4.353-358","DOIUrl":"https://doi.org/10.18178/joig.11.4.353-358","url":null,"abstract":"Skin cancer has become the fifth-most dangerous type of cancer. Melanoma, the most ferocious type of skin cancer, should be detected and treated to reduce the risk of spreading to the rest of the body’s organs. This study aims to provide fast and painless detection of skin cancer using image processing, including enhancement and extraction of interesting features for the characterization and classification of infected skin images into melanoma or nonmelanoma in MATLAB. The features used for texture analysis of inserted images are the Gray Level Co-occurrence Matrix (GLCM) and Local Binary Pattern (LBP). The classification of melanoma and non-melanoma is done by training a Support Vector Machine (SVM) using the radial basis function kernel. The accuracy of testing is 94.87%.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":" 65","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138614083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
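The Local Binary Pattern feature mentioned above can be illustrated with the basic 3×3 operator. The paper's implementation is in MATLAB and its exact LBP variant is not stated, so this Python sketch is only an assumption-laden illustration: the neighbor ordering and the `>=` comparison are one common convention, and `lbp_codes` and `lbp_histogram` are hypothetical names.

```python
def lbp_codes(image):
    """Basic 3x3 LBP code for every interior pixel of a grayscale image."""
    # Clockwise neighbor offsets starting at the top-left (an assumed convention).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(image), len(image[0])
    codes = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit  # set the bit when the neighbor is not darker
            row.append(code)
        codes.append(row)
    return codes

def lbp_histogram(image):
    """256-bin histogram of LBP codes, usable as a texture feature vector."""
    hist = [0] * 256
    for row in lbp_codes(image):
        for code in row:
            hist[code] += 1
    return hist
```

A histogram like this, possibly concatenated with GLCM statistics, is the kind of fixed-length vector an SVM can be trained on.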
Pub Date: 2023-12-01 DOI: 10.18178/joig.11.4.391-396
Nguyen Minh Trieu, Nguyen Truong Thinh
Mango grading is still a manual process in agriculture. Mangoes are currently classified based on human experience, which leads to inconsistent grades across agricultural export establishments. Automated mango grading is therefore important for solving this problem. In this study, a random forest algorithm is proposed for an automated mango grading system based on quality attributes such as density, surface defects, and weight. The external features, including dimensions and surface defects, are extracted from the captured image. These features are combined with the weight to estimate density. This study uses 732 mangoes collected from several local farms. In experiments, the grading system achieves a high accuracy of 98.3%. As an alternative to Non-Destructive Testing (NDT) equipment, this grading method can also be applied to evaluate the quality of other tropical fruits.
{"title":"Using Random Forest Algorithm to Grading Mango's Quality Based on External Features Extracted from Captured Images","authors":"Nguyen Minh Trieu, Nguyen Truong Thinh","doi":"10.18178/joig.11.4.391-396","DOIUrl":"https://doi.org/10.18178/joig.11.4.391-396","url":null,"abstract":"The grading of mango is still a manual process in agriculture. Nowadays, mangoes are classified based on human experience, which makes the grade not uniform for agricultural product export establishments. Therefore, the automated grading of mango is very important to solve these problems. In this study, a random forest algorithm is proposed for an automated mango grading system based on quality attributes such as density, surface defect, and weight. The internal features including dimensions and surface defects are extracted via the captured image. These features are combined with the weight to estimate density. This study uses 732 mangoes that are collected from several local farms. The experiment of the grading system has high accuracy with 98.3%. Instead of using Non-Destructive Testing (NDT) equipment, this grading method can be used to apply to evaluate the quality of other tropical fruits.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"99 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138626163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
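The step of combining image-derived dimensions with weight to estimate density can be sketched as below. The abstract does not say how volume is obtained from the dimensions, so the ellipsoid approximation here is purely an assumption made for illustration, and both function names are hypothetical.

```python
import math

def ellipsoid_volume_cm3(length_cm, width_cm, thickness_cm):
    """Volume of an ellipsoid whose three axes equal the measured dimensions.

    Assumption: the fruit is approximated as an ellipsoid, V = pi * L * W * T / 6.
    """
    return math.pi * length_cm * width_cm * thickness_cm / 6.0

def estimate_density(weight_g, length_cm, width_cm, thickness_cm):
    """Density in g/cm^3 from measured weight and the approximated volume."""
    return weight_g / ellipsoid_volume_cm3(length_cm, width_cm, thickness_cm)
```

The resulting density value could then serve as one input feature, alongside surface-defect measurements and weight, for a classifier such as a random forest.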
Pub Date: 2023-12-01 DOI: 10.18178/joig.11.4.309-320
Amine Mansouri, Toufik Bakir, S. Femmam
Skeleton-based human action recognition conveys useful information about the dynamics of the human body. In this work, we develop a method that uses a multi-stream model with connections between the parallel streams. This work is inspired by a state-of-the-art method called FUSION-CPA that merges two modalities: infrared input and skeleton input. Because we are interested in improving the skeleton-branch backbone, we use the Spatial-Temporal Graph Convolutional Networks (ST-GCN) model and an EfficientGCN attention module, aiming to better capture spatial and temporal features. In addition, we exploit the Graph Convolutional Network (GCN) implemented in the ST-GCN model to capture the graph connectivity of skeletons. This paper reports accuracy of over 91% and 93% on the cross-subject and cross-view benchmarks, respectively, of a large-scale dataset (NTU-RGB+D 60). The proposed model has 9 million fewer trainable parameters than the FUSION-CPA model.
{"title":"Human Action Recognition with Skeleton and Infrared Fusion Model","authors":"Amine Mansouri, Toufik Bakir, S. Femmam","doi":"10.18178/joig.11.4.309-320","DOIUrl":"https://doi.org/10.18178/joig.11.4.309-320","url":null,"abstract":"Skeleton-based human action recognition conveys interesting information about the dynamics of a human body. In this work, we develop a method that uses a multi-stream model with connections between the parallel streams. This work is inspired by a state-of-the-art method called FUSIONCPA that merges different modalities: infrared input and skeleton input. Because we are interested in investigating improvements related to the skeleton-branch backbone, we used the Spatial-Temporal Graph Convolutional Networks (ST-GCN) model and an EfficientGCN attention module. We aim to provide improvements when capturing spatial and temporal features. In addition, we exploited a Graph Convolutional Network (GCN) implemented in the ST-GCN model to capture the graphic connectivity in skeletons. This paper reports interesting accuracy on a large-scale dataset (NTU-RGB+D 60), over 91% and 93% on respectively crosssubject, and cross-view benchmarks. This proposed model is lighter by 9 million training parameters compared with the model FUSION-CPA.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":" 89","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138613767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the 2021 World Health Organization classification of gliomas, Isocitrate Dehydrogenase (IDH) plays a key role. The prognosis of glioma is largely affected by IDH mutation status, so this status needs to be predicted before surgery. Over the past decade, with the development of machine learning, a growing number of machine learning methods, especially deep learning methods, have been applied to computer-aided diagnosis systems. Many deep learning and radiomics-based methods have been proposed for IDH prediction using multimodal Magnetic Resonance Imaging (MRI). In this study, we propose an intra- and inter-modality fusion model with invariant and specific constraints to improve the performance of IDH status prediction. First, MRI-based radiomics features are fused with deep learning features within each modality (intra-modality fusion); then the features extracted from each modality of brain MRI are fused using an inter-modality fusion model with invariant and specific constraints. We evaluated the proposed method on a dataset provided by the Affiliated Hospital of Zhengzhou University in Zhengzhou, China, and demonstrated its effectiveness. We propose two inter-modality fusion models, and our experimental results show that the best of them outperformed state-of-the-art methods with an accuracy of 0.79, precision of 0.80, recall of 0.75, and F1 score of 0.78. Thus, we predicted the IDH mutation status for glioma treatment with a 2% increase in accuracy and a 4% increase in precision.
{"title":"An Intra- and Inter-Modality Fusion Model with Invariant- and Specific-Constraints Using MR Images for Prediction of Glioma Isocitrate Dehydrogenase Mutation Status","authors":"Xiaoyu Shi, Yinhao Li, Yen-wei Chen, Jingliang Cheng, J. Bai, Guohua Zhao","doi":"10.18178/joig.11.4.321-329","DOIUrl":"https://doi.org/10.18178/joig.11.4.321-329","url":null,"abstract":"In the 2021 World Health Organization classification of gliomas, it is proposed that Isocitrate Dehydrogenase (IDH) plays a key role. The prognosis of glioma is largely affected by IDH mutation status. Therefore, IDH mutation status needs to be predicted in advance before surgery. In the past decade, with the development of machine learning, more and more machine learning methods, especially deep learning methods, have been applied to the development of computer-aided diagnosis systems. At present, in this field, many deep learning and radiomics based methods have been proposed for IDH prediction using multimodal Magnetic Resonance Imaging (MRI). In this study, we proposed an intra- and inter-modality fusion model with invariant- and specific- constraints to improve the performance of IDH status prediction. First, MRI-based radiomics features were fused with deep learning features in each modality (intra-modality fusion) and then the features extracted from each modality of brain MRI were fused by using an inter-modality fusion model with invariant and specific constraints. We experimented our proposed method on the dataset provided by the Affiliated Hospital of Zhengzhou University in Zhengzhou, China and demonstrated the effectiveness of the proposed method. In our study, we propose two inter-modality fusion models, and our experimental results show that our best proposed method outperformed state-of-the-art methods with an accuracy of 0.79, precision of 0.80, recall of 0.75, and F1 score of 0.78. 
Thus, we predicted the IDH mutation status for glioma treatment with a 2% increase in accuracy and 4% increase in precision to predict the IDH mutation status for glioma treatment.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"5 21","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138623852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
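As a quick consistency check on the reported metrics, the F1 score can be recomputed from the stated precision and recall using the standard harmonic-mean definition. The worked arithmetic below uses only the rounded figures quoted in the abstract.

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.80 \times 0.75}{0.80 + 0.75}
    = \frac{1.20}{1.55}
    \approx 0.774
```

The rounded inputs give about 0.774, slightly below the reported 0.78. The gap is within what rounding of precision and recall to two decimals can produce: unrounded values near 0.805 and 0.755, for example, would give roughly 0.779.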
Pub Date: 2023-12-01 DOI: 10.18178/joig.11.4.384-390
Annisa Istiqomah Arrahmah, Rissa Rahmania, D. E. Saputra
Oil pipeline monitoring using Unmanned Airborne Vehicles (UAV) can be done by utilizing Deep Learning. Deep Learning can be used to automatically detect harmful or unauthorized objects near the pipeline for further action by the authorities. Video of the pipeline area captured from a UAV has distinctive characteristics: it has low resolution, the objects in each image are densely arranged, and the objects appear at a small scale because they are far from the UAV. The choice of Deep Learning algorithm is therefore important for obtaining good results under these conditions. The Single Shot MultiBox Detector (SSD) is a popular Deep Learning algorithm that is fast compared with others and suitable for real-time object detection. Previous works on this topic used low- to medium-altitude datasets (20–200 m). This paper evaluates an SSD implementation for detecting vehicles in a high-altitude dataset (300 m). A total of 2482 data samples are fed into the SSD architecture, which is trained to detect 3 classes of vehicles. The results show that the mAP and mAR are 0.026360 and 0.067377, respectively. However, the low loss function value shows that the model is able to classify the objects correctly. In conclusion, the SSD cannot process low-density information to correctly locate the objects.
{"title":"Evaluation of SSD Architecture for Small Size Object Detection: A Case Study on UAV Oil Pipeline Monitoring","authors":"Annisa Istiqomah Arrahmah, Rissa Rahmania, D. E. Saputra","doi":"10.18178/joig.11.4.384-390","DOIUrl":"https://doi.org/10.18178/joig.11.4.384-390","url":null,"abstract":"Oil pipeline monitoring using Unmanned Airborne Vehicles (UAV) can be done by utilizing Deep Learning. Deep Learning can be used to automatically detect harmed or unauthorized objects near the pipeline for further action by the authority. Input video in the pipeline area taken from the UAV has unique characteristics. It has low resolution with dense composition object in the image. The detected object also has a small scale as the objects are far away from the UAV. Thus, the selection of the Deep Learning algorithm is important to get a desirable result with the following conditions. Single Shot Multi-Box (SSD) is one of the popular Deep Learning algorithms with fast calculation compared to others and suitable for real-time object detection. Previous works on this topic using low to medium altitude dataset (20–200 m). This paper provides an evaluation of SSD implementation to detect vehicles on high-altitude dataset (300 m). As much as 2482 dataset is fed into SSD architecture and trained to detect 3 class of vehicles. The result shows the mAP and mAR are 0.026360 and 0.067377, respectively. However, the low lost function value shows that the model is able to classify the object correctly. 
In conclusion, the SSD cannot process low density information to correctly locate the object.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":" 6","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138614664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
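Metrics such as mAP and mAR depend on deciding whether a predicted box matches a ground-truth box, which is usually done with Intersection over Union (IoU). The sketch below shows that building block only; it assumes axis-aligned boxes given as `(x1, y1, x2, y2)` corner coordinates, and the function name `iou` is hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

For very small objects, a localization error of a few pixels can push IoU below the matching threshold, which is one plausible reason classification can look acceptable while mAP stays low.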
Pub Date: 2023-12-01 DOI: 10.18178/joig.11.4.397-404
Y. Yuhandri, A. Windarto, Muhammad Noor Hasan Siregar
Accurate brain tumor detection is crucial due to its high mortality rate. However, existing automated methods suffer from limited accuracy and high false-positive rates. In this study, we aimed to improve brain tumor classification by comparing 17 different classifiers organized into six groups: Decision Tree (DT) Model, Support Vector Machine (SVM), Naive Bayes Classifier, Logistic Regression, Generalized Linear Model (GLM) Classifier, and Neural Network. We utilized a dataset of 3,762 Magnetic Resonance Imaging (MRI) scans of brain tumors from Kaggle, with each image having dimensions of 240 × 240 pixels and labeled as tumor or non-tumor. Our approach involved three main steps: extracting visual information using 17 predictor classes, optimizing feature extraction through weight optimization, and comparing different sets of classifier models. We evaluated the models’ performance using the confusion matrix and Receiver Operating Characteristics (ROC) curves. Our results showed that optimizing feature selection and utilizing ensemble classifiers improved the accuracy of brain tumor classification. The DT Model with ensemble classifiers emerged as the best-performing classifier, achieving an accuracy of 98.11% and an AUC of 0.99. Notably, Random Tree (RT) exhibited the highest accuracy within the ensemble classifier set, with a significant increase compared to other models. Our proposed method outperformed the standard approach, demonstrating its potential for enhancing brain tumor detection accuracy. This study contributes to the field by providing a more accurate method for detecting brain tumors, potentially enabling earlier detection and improved patient outcomes. Future research should focus on further improving brain tumor diagnosis and treatment through the application of machine learning techniques.
{"title":"Improving Brain Tumor Classification Efficacy through the Application of Feature Selection and Ensemble Classifiers","authors":"Y. Yuhandri, A. Windarto, Muhammad Noor Hasan Siregar","doi":"10.18178/joig.11.4.397-404","DOIUrl":"https://doi.org/10.18178/joig.11.4.397-404","url":null,"abstract":"Accurate brain tumor detection is crucial due to its high mortality rate. However, existing automated methods suffer from limited accuracy and high false-positive rates. In this study, we aimed to improve brain tumor classification by comparing 17 different classifiers organized into six groups: Decision Tree (DT) Model, Support Vector Machine (SVM), Naive Bayes Classifier, Logistic Regression, Generalized Linear Model (GLM) Classifier, and Neural Network. We utilized a dataset of 3,762 Magnetic Resonance Imaging (MRI) scans of brain tumors from Kaggle, with each image having dimensions of 240 × 240 pixels and labeled as tumor or non-tumor. Our approach involved three main steps: extracting visual information using 17 predictor classes, optimizing feature extraction through weight optimization, and comparing different sets of classifier models. We evaluated the models’ performance using the confusion matrix and Receiver Operating Characteristics (ROC) curves. Our results showed that optimizing feature selection and utilizing ensemble classifiers improved the accuracy of brain tumor classification. The DT Model with ensemble classifiers emerged as the best-performing classifier, achieving an accuracy of 98.11% and an AUC of 0.99. Notably, Random Tree (RT) exhibited the highest accuracy within the ensemble classifier set, with a significant increase compared to other models. Our proposed method outperformed the standard approach, demonstrating its potential for enhancing brain tumor detection accuracy. 
This study contributes to the field by providing a more accurate method for detecting brain tumors, potentially enabling earlier detection and improved patient outcomes. Future research should focus on further improving brain tumor diagnosis and treatment through the application of machine learning techniques.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":" 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138614904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
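The evaluation described above, a confusion matrix plus ROC analysis, can be sketched for a binary tumor/non-tumor task as follows. This is a generic illustration rather than the authors' pipeline: AUC is computed by the pairwise-ranking (Mann-Whitney) formulation, and the function names and the 0.5 threshold are assumptions.

```python
def confusion_counts(labels, scores, threshold=0.5):
    """Return (tp, fp, tn, fn) for binary labels and predicted scores."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def accuracy(tp, fp, tn, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why reporting both gives a fuller picture.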