Pub Date: 2023-11-03 DOI: 10.1142/s021946782550041x
Manoj Krishna Bhosale, Shubhangi B. Patil, Babasaheb B Patil
The growing number of surveillance cameras has increased the demand for more efficient video coding. Modern video coding standards have improved coding efficiency considerably, but they were developed for general-purpose video rather than surveillance video. Vehicle recognition plays a challenging and promising role in computer vision applications and intelligent transport systems. Most conventional techniques recognize vehicles only through a bounding-box depiction and therefore fail to provide the precise locations of the vehicles. Position details are vital for real-time applications such as tracking the trajectory of a vehicle’s motion on the road and estimating its movement. Numerous advances have been made over the years in traffic surveillance through the proliferation of intelligent traffic video surveillance techniques. The goal of this work is to design and enhance an intelligent traffic video surveillance technique using deep learning. The model handles video traffic surveillance by measuring the speed of vehicles and recognizing their number plates. The first step is data collection, in which the traffic video data is gathered. Vehicle detection is then performed by an Optimized YOLOv3 deep learning classifier whose parameters are optimized with the newly proposed Modified Coyote Spider Monkey Optimization (MCSMO), a combination of the Coyote Optimization Algorithm (COA) and Spider Monkey Optimization (SMO). Next, the speed of each vehicle is measured from the frames. For high-speed vehicles, the same Optimized YOLOv3 is used to detect the number plates. Once the number plates are detected, plate character recognition is performed by the Improved Convolutional Neural Network (ICNN). Information about vehicles that violate traffic rules can thus be conveyed to the vehicle owners and the Regional Transport Office (RTO) so that further action can be taken to avoid accidents. In experimental validation, the designed method achieves an accuracy of 97.53% and a precision of 96.83%. The results show that the proposed method outperforms conventional models, thus ensuring the security of the transport system.
Title: Automatic Video Traffic Surveillance System with Number Plate Character Recognition Using Hybrid Optimization-Based YOLOv3 and Improved CNN
Journal: International Journal of Image and Graphics
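The abstract above says that vehicle speed is measured from the frames but does not say how. One common way to do this, shown here only as a minimal sketch and not as the authors' method, is to convert the displacement of a detected box centre between two consecutive frames into a speed. The function names, the `metres_per_pixel` calibration factor, and the example numbers are all assumptions introduced for illustration.

```python
import math


def estimate_speed_kmh(centroid_prev, centroid_curr, metres_per_pixel, fps):
    """Estimate speed from the movement of one box centre between two consecutive frames.

    metres_per_pixel is a hypothetical camera calibration factor; a real system
    would need perspective correction, which this sketch ignores.
    """
    dx = centroid_curr[0] - centroid_prev[0]
    dy = centroid_curr[1] - centroid_prev[1]
    pixels_moved = math.hypot(dx, dy)                 # Euclidean displacement in pixels
    metres_per_second = pixels_moved * metres_per_pixel * fps
    return metres_per_second * 3.6                    # m/s to km/h


def is_speeding(speed_kmh, limit_kmh):
    """Flag vehicles whose plates should be passed to the plate-detection stage."""
    return speed_kmh > limit_kmh


# A centre that moves 5 pixels per frame at 30 fps with 0.1 m per pixel
# travels 15 m/s, which is about 54 km/h.
speed = estimate_speed_kmh((100.0, 200.0), (103.0, 204.0), 0.1, 30.0)
flagged = is_speeding(speed, 50.0)
```

In a pipeline like the one described, only vehicles for which `is_speeding` returns true would go on to number plate detection and character recognition.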
Pub Date: 2023-11-03 DOI: 10.1142/s0219467825500433
B. N. Nithya, D. Evangelin Geetha, Manish Kumar
In today’s world, the web is a prominent communication channel. However, the variety of strategies available on event-based social networks (EBSNs) makes it difficult for users to choose the events most relevant to their interests. In EBSNs, searching for events that fit a user’s preferences is necessary but complex and time consuming because of the large number of events available. To this end, an event recommender framework based on community-contributed data helps users filter overwhelming information and provides appropriate feedback, making EBSNs more appealing to them. This work introduces a novel customized event recommendation system that uses a multi-criteria decision-making (MCDM) approach to rank events. The proposed model computes categorical, geographical, temporal, and social factors, and orders the recommendation list with a contextual post-filtering system that includes Weight and Filter. A new probabilistic weight model is added to align the recommendation list. The model also incorporates metaheuristic reasoning, which fine-tunes the probabilistic threshold value using a new hybrid algorithm. The proposed hybrid, referred to as the Beetle Swarm Hybridized Elephant Herding Algorithm (BSH-EHA), combines Elephant Herding Optimization (EHO) and Beetle Swarm Optimization (BSO). Finally, the top recommendations are given to the users.
Title: Metaheuristic-Assisted Contextual Post-Filtering Method for Event Recommendation System
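The ranking idea in the abstract above (score each event on categorical, geographical, temporal, and social factors, then weight and filter) can be illustrated with a simple weighted sum followed by a threshold. This is a sketch under stated assumptions: the abstract's weight model is probabilistic and its threshold is tuned by BSH-EHA, whereas the weights, threshold, and event data below are fixed, hypothetical values chosen for illustration.

```python
def score_event(factors, weights):
    """Weighted sum of the per-criterion scores, each assumed to lie in [0, 1]."""
    return sum(weights[name] * factors[name] for name in weights)


def recommend(events, weights, threshold, top_k):
    """Weight every event, filter out those below the threshold, return the top_k ids."""
    scored = [(score_event(factors, weights), event_id)
              for event_id, factors in events.items()]
    kept = [(score, event_id) for score, event_id in scored if score >= threshold]
    kept.sort(key=lambda pair: (-pair[0], pair[1]))   # best score first, ties by id
    return [event_id for _, event_id in kept[:top_k]]


# Hypothetical weights and events, for illustration only.
weights = {"categorical": 0.4, "geographical": 0.2, "temporal": 0.2, "social": 0.2}
events = {
    "concert": {"categorical": 1.0, "geographical": 0.5, "temporal": 0.5, "social": 0.0},
    "meetup": {"categorical": 0.5, "geographical": 1.0, "temporal": 1.0, "social": 1.0},
    "lecture": {"categorical": 0.0, "geographical": 0.5, "temporal": 0.0, "social": 0.5},
}
top = recommend(events, weights, threshold=0.5, top_k=2)
```

Here "meetup" scores about 0.8, "concert" about 0.6, and "lecture" about 0.2, so the filter drops "lecture" and the list is returned in descending score order. A metaheuristic such as the one named in the abstract would search over `threshold` instead of fixing it by hand.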
Pub Date: 2023-11-03 DOI: 10.1142/s0219467825500391
Jampani Ravi, R. Narmadha
Imaging technology has undergone extensive development since 1985, with practical implications for both civilian and military use. Image fusion has recently emerged as an image processing tool that is adept at handling diverse image types. These include remote sensing images and medical images, where information is enriched by fusing visible and infrared light based on the analysis of the materials used. At present, image fusion is performed mainly in the medical field. Given the constraints of diagnosing a disease from single-modality images, image fusion can meet the requirements. Hence, it is suggested to develop a fusion model using different image modalities. The main aim of the fusion approach is to achieve higher contrast, enhancing image quality and the apparent knowledge. Fused images are validated against three criteria: (i) fused images should retain the significant information from the source images, (ii) artifacts must not be present in the fused images and (iii) flaws due to noise and misregistration must be avoided. Multimodal image fusion is a developing domain driven by robust algorithms and standard transformation techniques. This work therefore analyzes the contributions of various multimodal image fusion models that use intelligent methods. It provides an extensive literature survey of image fusion techniques and compares those methods with existing ones. It presents state-of-the-art image fusion methods at their different levels, along with their pros and cons. The review introduces the current fusion methods, the modes of multimodal fusion, the datasets used and the performance metrics; finally, it discusses the challenges of multimodal image fusion methods and future research trends.
Title: A Systematic Literature Review on Multimodal Image Fusion Models With Challenges and Future Research Trends
Pub Date: 2023-11-01 DOI: 10.1142/s0219467825500366
Yuze Zhou, Liwei Yan, Qi Zhu
As a kind of promising biometric technology, multispectral palmprint recognition methods have attracted increasing attention in security due to their high recognition accuracy and ease of use. It is worth noting that although multispectral palmprint data contains rich complementary information, multispectral palmprint recognition methods are still vulnerable to adversarial attacks. Even if only one image of a spectrum is attacked, it can have a catastrophic impact on the recognition results. Therefore, we propose a robustness-enhanced multispectral palmprint recognition method, including a model interpretability-based adversarial detection module and a robust multispectral fusion module. Inspired by the model interpretation technology, we found there is a large difference between clean palmprint and adversarial examples after CAM visualization. Using visualized images to build an adversarial detector can lead to better detection results. Finally, the weights of clean images and adversarial examples in the fusion layer are dynamically adjusted to obtain the correct recognition results. Experiments have shown that our method can make full use of the image features that are not attacked and can effectively improve the robustness of the model.
Title: Adversarial Detection and Fusion Method for Multispectral Palmprint Recognition
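The abstract above describes dynamically adjusting the fusion-layer weights of clean and adversarial inputs. A minimal sketch of that idea follows, assuming a detector that outputs, for each spectrum, a score in [0, 1] for how likely the image is adversarial. The scoring convention, the function names, and the normalisation rule are assumptions made for illustration; the paper's actual fusion layer is not specified in the abstract.

```python
def fusion_weights(adversarial_scores):
    """Turn per-spectrum adversarial scores into fusion weights that sum to 1.

    A score of 1.0 (certainly attacked) gives weight 0; if every spectrum is
    flagged, fall back to equal weights rather than dividing by zero.
    """
    trust = [1.0 - score for score in adversarial_scores]
    total = sum(trust)
    if total == 0.0:
        return [1.0 / len(trust)] * len(trust)
    return [t / total for t in trust]


def fuse(features, weights):
    """Weighted sum of equally sized per-spectrum feature vectors."""
    length = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(length)]


# Three spectra; the second is flagged as adversarial and is ignored in the fusion.
scores = [0.0, 1.0, 0.0]
features = [[2.0, 4.0], [100.0, -100.0], [4.0, 8.0]]
weights = fusion_weights(scores)
fused = fuse(features, weights)
```

With these inputs the attacked spectrum receives zero weight, so the fused vector is the average of the two clean ones, which matches the abstract's claim that unattacked features can still be used in full.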
Pub Date: 2023-11-01 DOI: 10.1142/s0219467823990012
Title: Author Index (Volume 23)
Pub Date: 2023-11-01 DOI: 10.1142/s0219467825500378
Zhipeng Li, Jun Wang, Lijun Hua, Honghui Liu, Wenli Song
Automatic tracking of three-dimensional (3D) human motion pose has the potential to provide corresponding technical support in various fields. However, existing methods for tracking human motion pose suffer from significant errors, long tracking times and suboptimal tracking results. To address these issues, an automatic tracking method for 3D human motion pose using contrastive learning is proposed. By using the feature parameters of 3D human motion poses, threshold variation parameters of 3D human motion poses are computed. The golden section is introduced to transform the threshold variation parameters and extract the features of 3D human motion poses by comparing the feature parameters with the threshold of parameter variation. Under the supervision of contrastive learning, a constraint loss is added to the local–global deep supervision module of contrastive learning to extract local parameters of 3D human motion poses, combined with their local features. After normalizing the 3D human motion pose images, frame differences of the background image are calculated. By constructing an automatic tracking model for 3D human motion poses, automatic tracking of 3D human motion poses is achieved. Experimental results demonstrate that the highest tracking lag is 9%, there is no deviation in node tracking, the pixel contrast is maintained above 90% and only 6 sub-blocks have detail loss. This indicates that the proposed method effectively tracks 3D human motion poses, tracks all the nodes, achieves high accuracy in automatic tracking and produces good tracking results.
Title: Automatic Tracking Method for 3D Human Motion Pose Using Contrastive Learning
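One step in the abstract above is normalizing the pose images and computing frame differences against the background image. The sketch below shows plain background differencing on small grayscale frames stored as nested lists. The 8-bit scaling, the threshold value, and the function names are assumptions for illustration and are not taken from the paper.

```python
def to_unit_range(frame):
    """Scale 8-bit grayscale values (0..255) to the range [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in frame]


def frame_difference_mask(background, frame, threshold):
    """Mark pixels whose normalized intensity differs from the background by more than threshold."""
    bg = to_unit_range(background)
    fr = to_unit_range(frame)
    return [[1 if abs(f - b) > threshold else 0 for b, f in zip(bg_row, fr_row)]
            for bg_row, fr_row in zip(bg, fr)]


# A bright vertical stripe appears in the middle column of an otherwise static scene.
background = [[10, 10, 10],
              [10, 10, 10]]
frame = [[10, 200, 10],
         [10, 210, 10]]
mask = frame_difference_mask(background, frame, threshold=0.2)
```

The resulting mask marks only the middle column as foreground, which is the kind of moving-region cue a tracker could then refine.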
Pub Date: 2023-10-20 DOI: 10.1142/s0219467825500445
Tesfayee Meshu Welde, Lejian Liao
Visual Question Answering (VQA) is a language-based method for analyzing images that is highly helpful in assisting people with visual impairment. A VQA system must demonstrate holistic image understanding and carry out basic reasoning about the image, in contrast to task-specific models that simply classify objects into categories. VQA systems thus contribute to the growth of Artificial Intelligence (AI) technology by answering open-ended, arbitrary questions about a given image. VQA is also used to assess a system’s ability through the Visual Turing Test (VTT). However, because the essential datasets are hard to generate and evaluation suffers from flaws and bias, a system’s overall efficiency cannot be assessed reliably. This is a significant limitation of VQA and, in turn, slows the performance progress observed in VQA algorithms. Current research on VQA addresses more specific sub-problems, including counting. Counting is one of the more sophisticated sub-problems and is riddled with challenging questions, especially complex counting questions that demand object identification together with detection of object attributes and positional reasoning. The pooling operation used to implement the attention mechanism in VQA has been found to degrade counting performance, and a number of algorithms have been developed to address this issue. In this paper, we provide a comprehensive survey of counting techniques in VQA systems developed especially for answering questions such as “How many?”. The performance achieved so far is still not satisfactory, owing to dataset bias arising from the way questions are phrased and to weak evaluation metrics. Future work calls for fully fledged architectures, large datasets with complex counting questions and a detailed breakdown into categories, and strong metrics for evaluating a system’s ability to answer complex counting questions that involve positional and comparative reasoning.
Title: Counting in Visual Question Answering: Methods, Datasets, and Future Work
Pub Date: 2023-10-14 DOI: 10.1142/s0219467825500305
Manbir Sandhu, Sumit Kushwaha, Tanvi Arora
Computed Tomography (CT) offers excellent visualization of the intricate internal structures of the body. To protect patients from potential radiation-related health risks, the acquisition of CT images should adhere to the “as low as reasonably achievable” (ALARA) principle. However, the acquired low-dose CT (LDCT) images are corrupted by artifacts and noise during acquisition, storage, and transmission, which degrades visual quality and causes the loss of image features and relevant information. Recently, generative adversarial network (GAN) models based on deep learning (DL) have demonstrated ground-breaking performance in minimizing image noise while maintaining high image quality. Their ability to adapt to uncertain noise distributions and their representation-learning ability make them highly desirable for denoising CT images. This paper comprehensively reviews the state-of-the-art GANs used for LDCT image denoising. Its aim is to highlight the potential of DL-based GANs for CT dose optimization and to present the future scope of research in LDCT image denoising.
Title: A Comprehensive Review of GAN-Based Denoising Models for Low-Dose Computed Tomography Images
Pub Date: 2023-09-26 DOI: 10.1142/s0219467825500214
P. John Bosco, S. Janakiraman
Content-Based Image Retrieval (CBIR) is a broad research field in today's digital world. This paper focuses on retrieving images by their visual properties, which carry high-level semantic information. The difference between low-level and high-level features is known as the semantic gap, and it is the biggest problem in CBIR. Visual characteristics are extracted from low-level features such as color, texture, and shape, and these low-level features improve CBIR performance. The paper mainly presents an image retrieval system that combines three color spaces (TriCLR: RGB, YCbCr, and [Formula: see text]) with a histogram of Local Binary Pattern (LBP) texture features (HistLBP); the resulting hybrid is referred to as TriCLR and HistLBP. The study also discusses the hybrid method in light of low-level features. Finally, the hybrid TriCLR and HistLBP algorithm provides a new solution for CBIR systems that performs better than the existing methods.
{"title":"Content-Based Image Retrieval (CBIR): Using Combined Color and Texture Features (TriCLR and HistLBP)","authors":"P. John Bosco, S. Janakiraman","doi":"10.1142/s0219467825500214","DOIUrl":"https://doi.org/10.1142/s0219467825500214","url":null,"abstract":"Content-Based Image Retrieval (CBIR) is a broad research field in the current digital world. This paper focuses on content-based image retrieval based on visual properties, consisting of high-level semantic information. The variation between low-level and high-level features is identified as a semantic gap. The semantic gap is the biggest problem in CBIR. The visual characteristics are extracted from low-level features such as color, texture and shape. The low-level feature increases CBIRs performance level. The paper mainly focuses on an image retrieval system called combined color (TriCLR) (RGB, YCbCr, and [Formula: see text]) with the histogram of texture features in LBP (HistLBP), which, is known as a hybrid of three colors (TriCLR) with Histogram of LBP (TriCLR and HistLBP). The study also discusses the hybrid method in light of low-level features. Finally, the hybrid approach uses the (TriCLR and HistLBP) algorithm, which provides a new solution to the CBIR system that is better than the existing methods.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135718979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
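As a rough illustration of the color-plus-texture idea in the TriCLR and HistLBP abstract above, the following Python sketch concatenates a per-channel color histogram with a histogram of basic 3x3 LBP codes and ranks images by L1 distance. It is not the authors' implementation: it uses RGB only (the third color space is elided in the abstract), and the bin count, the distance measure, and all function names are assumptions chosen for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Image = List[List[Pixel]]      # rows of pixels

# Offsets (row, col) of the 8 neighbours, clockwise from the top-left.
_NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]


def to_gray(img: Image) -> List[List[int]]:
    """Integer luma approximation using ITU-R BT.601 weights."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row]
            for row in img]


def lbp_histogram(gray: List[List[int]]) -> List[float]:
    """Normalized 256-bin histogram of basic 3x3 LBP codes (interior pixels only)."""
    hist = [0] * 256
    height, width = len(gray), len(gray[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            center = gray[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(_NEIGHBOURS):
                # A neighbour at least as bright as the center sets its bit.
                if gray[y + dy][x + dx] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist) or 1     # avoid division by zero for tiny images
    return [count / total for count in hist]


def color_histogram(img: Image, bins: int = 4) -> List[float]:
    """Per-channel histogram with `bins` bins per channel, concatenated as R, G, B."""
    hist = [0] * (3 * bins)
    pixels = 0
    for row in img:
        for px in row:
            for channel, value in enumerate(px):
                hist[channel * bins + value * bins // 256] += 1
            pixels += 1
    pixels = pixels or 1
    return [count / pixels for count in hist]


def descriptor(img: Image) -> List[float]:
    """Hybrid feature vector: color histogram followed by LBP histogram."""
    return color_histogram(img) + lbp_histogram(to_gray(img))


def l1_distance(a: List[float], b: List[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def rank(query: Image, database: List[Image]) -> List[int]:
    """Indices of database images, most similar to the query first."""
    q = descriptor(query)
    return sorted(range(len(database)),
                  key=lambda i: l1_distance(q, descriptor(database[i])))
```

Calling `rank(query, database)` returns database indices ordered by increasing descriptor distance. A practical system would precompute and store the database descriptors instead of recomputing them for every query.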
Pub Date: 2023-09-25 DOI: 10.1142/s0219467825500329
R. S. Rajasree, S. Brintha Rajakumari
Machine learning (ML) and deep learning (DL) techniques can considerably improve the precision of Alzheimer’s disease (AD) diagnosis. Recently, DL techniques have had considerable success in processing medical data, but they still have drawbacks, such as large data requirements and a protracted training phase. To address these concerns, we have developed a novel four-stage strategy. In the first stage, the input data undergoes data imbalance processing, which is crucial for enhancing the accuracy of disease detection. Subsequently, entropy-based, correlation-based, and improved mutual information-based features are extracted from the pre-processed data. Because the curse of dimensionality is a serious issue in this work, we address it with an optimization strategy: the tunicate updated golden eagle optimization (TUGEO) algorithm is proposed to select the optimal features from the extracted features. Finally, an ensemble classifier that integrates CNN, DBN, and improved RNN models is trained on the selected optimal features to diagnose the disease. The suggested model achieves a maximum F-measure of 97.67, which is better than extant methods such as [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text]. The suggested TUGEO-based AD detection is then compared with traditional models on various performance metrics, including accuracy, sensitivity, specificity, and precision.
{"title":"Deep Ensemble of Classifiers for Alzheimer’s Disease Detection with Optimal Feature Set","authors":"R. S. Rajasree, S. Brintha Rajakumari","doi":"10.1142/s0219467825500329","DOIUrl":"https://doi.org/10.1142/s0219467825500329","url":null,"abstract":"Machine learning (ML) and deep learning (DL) techniques can considerably enhance the process of making a precise diagnosis of Alzheimer’s disease (AD). Recently, DL techniques have had considerable success in processing medical data. They still have drawbacks, like large data requirements and a protracted training phase. With this concern, we have developed a novel strategy with the four stages. In the initial stage, the input data is subjected to data imbalance processing, which is crucial for enhancing the accuracy of disease detection. Subsequently, entropy-based, correlation-based, and improved mutual information-based features will be extracted from these pre-processed data. However, the curse of dimensionality will be a serious issue in this work, and hence we have sorted it out via optimization strategy. Particularly, the tunicate updated golden eagle optimization (TUGEO) algorithm is proposed to pick out the optimal features from the extracted features. Finally, the ensemble classifier, which integrates models like CNN, DBN, and improved RNN is modeled to diagnose the diseases by training the selected optimal features from the previous stage. The suggested model achieves the maximum F-measure as 97.67, which is better than the extant methods like [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text], respectively. 
The suggested TUGEO-based AD detection is then compared to the traditional models like various performance matrices including accuracy, sensitivity, specificity, and precision.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135816967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
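The evaluation measures named in the Alzheimer's disease abstract above (F-measure, accuracy, sensitivity, specificity, and precision) all follow from the four counts of a binary confusion matrix. The Python sketch below shows the standard definitions; the class and function names are assumptions for illustration, and the counts in the usage note are made up, not results from the paper.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Confusion:
    """Counts of a binary confusion matrix (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Division that returns 0.0 instead of failing on an empty denominator."""
    return numerator / denominator if denominator else 0.0


def metrics(c: Confusion) -> Dict[str, float]:
    precision = _ratio(c.tp, c.tp + c.fp)
    sensitivity = _ratio(c.tp, c.tp + c.fn)          # also called recall
    specificity = _ratio(c.tn, c.tn + c.fp)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    # F-measure (F1): harmonic mean of precision and sensitivity.
    f_measure = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }
```

For example, `metrics(Confusion(tp=8, fp=2, tn=9, fn=1))` gives a precision of 0.8 and an accuracy of 0.85. Multiplying the F-measure by 100 gives a percentage; the abstract's 97.67 is presumably on that scale, though it does not say so.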