Fault Signal Perception of Nanofiber Sensor for 3D Human Motion Detection Using Multi-Task Deep Learning
Pub Date: 2024-03-12 | DOI: 10.1142/s0219467825500603
Yun Liu
Once a fault occurs in a nanofiber sensor, the reliability of three-dimensional (3D) human motion detection results is compromised. The sensor's fault signals must therefore be perceived accurately and rapidly, and the fault type determined, so that the sensor can continue to operate in a sustained and stable manner. We therefore propose a fault signal perception method, based on multi-task deep learning, for nanofiber sensors used in 3D human motion detection. First, the fault characteristic parameters of the nanofiber sensor are obtained and the fault is reconstructed, completing fault localization. Second, the fault signal is mapped by a penalty function, and a feature extraction model for the fault signal is constructed in combination with multi-task deep learning. Finally, the multi-task deep learning algorithm calculates the sampling frequency of the fault signal, and the key fault variables are extracted from the amplitude of the sensor's state changes, realizing perception of the fault signal. The results show that the proposed method accurately perceives the fault signal of a nanofiber sensor in 3D human motion detection: the maximum fault location accuracy is 97%, and the maximum noise content of the fault signal is only 5 dB, indicating that the method can be widely applied to fault signal perception.
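As a rough illustration of the multi-task setup the abstract describes, the sketch below pairs a shared signal encoder with two heads, one classifying the fault type and one regressing a normalized fault location. The architecture, layer sizes, and the use of PyTorch are all illustrative assumptions; the paper's actual network is not specified here.

```python
# Hypothetical two-head multi-task network for sensor fault signals.
import torch
import torch.nn as nn

class MultiTaskFaultNet(nn.Module):
    """Shared 1D-convolutional encoder with a fault-type head and a fault-location head."""
    def __init__(self, n_fault_types=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.fault_type = nn.Linear(32, n_fault_types)  # task 1: classify the fault
        self.fault_pos = nn.Linear(32, 1)               # task 2: locate the fault (0..1)

    def forward(self, x):                               # x: (batch, 1, window_len)
        h = self.encoder(x)
        return self.fault_type(h), torch.sigmoid(self.fault_pos(h)).squeeze(1)

model = MultiTaskFaultNet()
x = torch.randn(8, 1, 256)                              # a batch of signal windows
logits, pos = model(x)
# Joint loss: cross-entropy for the type head, MSE for the location head.
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 4, (8,))) \
     + nn.MSELoss()(pos, torch.rand(8))
loss.backward()
```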
{"title":"Fault Signal Perception of Nanofiber Sensor for 3D Human Motion Detection Using Multi-Task Deep Learning","authors":"Yun Liu","doi":"10.1142/s0219467825500603","DOIUrl":"https://doi.org/10.1142/s0219467825500603","url":null,"abstract":"Once a fault occurs in the nanofiber sensor, the scientific and reliable three-dimensional (3D) human motion detection results will be compromised. It is necessary to accurately and rapidly perceive the fault signals of the nanofiber sensor and determine the type of fault, to enable it to continue operating in a sustained and stable manner. Therefore, we propose a fault signal perception method for 3D human motion detection nanofiber sensor based on multi-task deep learning. First, through obtaining the fault characteristic parameters of the nanofiber sensor, the fault of the nanofiber sensor is reconstructed to complete the fault location of the nanofiber sensor. Second, the fault signal of the nanofiber sensor is mapped by the penalty function, and the feature extraction model of the fault signal of the nanofiber sensor is constructed by combining the multi-task deep learning. Finally, the multi-task deep learning algorithm is used to calculate the sampling frequency of the fault signal, and the key variable information of the fault of the nanofiber sensor is extracted according to the amplitude of the state change of the nanofiber sensor, to realize the perception of the fault signal of the nanofiber sensor. The results show that the proposed method can accurately perceive the fault signal of a nanofiber sensor in 3D human motion detection, the maximum sensor fault location accuracy is 97%, and the maximum noise content of the fault signal is only 5 dB, which shows that the method can be widely used in fault signal perception.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140248265","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Entropy Kernel Graph Cut Feature Space Enhancement with SqueezeNet Deep Neural Network for Textural Image Segmentation
Pub Date: 2024-03-12 | DOI: 10.1142/s0219467825500640
M. Niazi, Kambiz Rahbar
Image segmentation based on graph cut methods has recently shown remarkable performance on a range of image data. Although the kernel graph cut method performs well, its performance depends heavily on how the data are mapped to the transformation space and on the image features. The entropy-based kernel graph cut method is suitable for segmenting textured images; nonetheless, its segmentation quality remains contingent on the accuracy and richness of the feature space representation and the kernel centers. This paper introduces an entropy-based kernel graph cut method that leverages the discriminative feature space extracted from SqueezeNet, a lightweight deep neural network. Fusing SqueezeNet's features enriches the segmentation process by capturing high-level semantic information. Moreover, the extraction of kernel centers is refined through a weighted k-means approach, further contributing to the segmentation's precision and effectiveness. The proposed method, while retaining the modest computational load of graph cut methods, is a suitable alternative for segmenting textured images. Experimental results on a set of well-known datasets containing textured shapes evaluate the efficiency of the algorithm against other well-known kernel graph cut methods.
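A minimal sketch of the feature-space construction: per-pixel features from an early SqueezeNet block, entropy-derived pixel weights, and weighted k-means to obtain kernel centers. The choice of block, the entropy weighting via a per-pixel softmax, and the cluster count are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np
import torch
from PIL import Image
from torchvision import models, transforms
from sklearn.cluster import KMeans

backbone = models.squeezenet1_1(weights=models.SqueezeNet1_1_Weights.DEFAULT).features.eval()

img = Image.open("texture.png").convert("RGB")          # placeholder input image
x = transforms.Compose([transforms.Resize((224, 224)),
                        transforms.ToTensor()])(img).unsqueeze(0)

with torch.no_grad():
    fmap = backbone[:5](x)                              # early feature block -> (1, C, H, W)

C = fmap.shape[1]
feats = fmap.squeeze(0).permute(1, 2, 0).reshape(-1, C).numpy()   # one row per pixel

# Entropy of the softmaxed feature vector as a per-pixel weight (stable softmax).
e = np.exp(feats - feats.max(axis=1, keepdims=True))
p = e / e.sum(axis=1, keepdims=True)
w = -(p * np.log(p + 1e-12)).sum(axis=1)

# Weighted k-means yields the kernel centers used by the graph cut energy.
centers = KMeans(n_clusters=4, n_init=10).fit(feats, sample_weight=w).cluster_centers_
print(centers.shape)                                    # (4, C)
```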
{"title":"Entropy Kernel Graph Cut Feature Space Enhancement with SqueezeNet Deep Neural Network for Textural Image Segmentation","authors":"M. Niazi, Kambiz Rahbar","doi":"10.1142/s0219467825500640","DOIUrl":"https://doi.org/10.1142/s0219467825500640","url":null,"abstract":"Recently, image segmentation based on graph cut methods has shown remarkable performance on a set of image data. Although the kernel graph cut method provides good performance, its performance is highly dependent on the data mapping to the transformation space and image features. The entropy-based kernel graph cut method is suitable for segmentation of textured images. Nonetheless, its segmentation quality remains significantly contingent on the accuracy and richness of feature space representation and kernel centers. This paper introduces an entropy-based kernel graph cut method, which leverages the discriminative feature space extracted from SqueezeNet, a deep neural network. The fusion of SqueezeNet’s features enriches the segmentation process by capturing high-level semantic information. Moreover, the extraction of kernel centers is refined through a weighted k-means approach, contributing further to the segmentation’s precision and effectiveness. The proposed method, while exploiting the benefits of suitable computational load of graph cut methods, will be a suitable alternative for segmenting textured images. Laboratory results have been taken on a set of well-known datasets that include textured shapes in order to evaluate the efficiency of the algorithm compared to other well-known methods in the field of kernel graph cut.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140249427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Application of Generative Adversarial Network in Image Color Correction
Pub Date: 2024-03-06 | DOI: 10.1142/s021946782550069x
Meiling Chen, Yao Shi, Lvfen Zhu
The popularity of electronic products has increased with the development of technology, and electronic devices allow people to obtain information through the transmission of images. However, color distortion can occur during transmission, which may render the images unusable. To this end, a deep residual network and a deep convolutional network were used to define the generator and discriminator, and self-attention-enhanced convolution was applied to the generator network to construct an image resolution correction model based on coupled generative adversarial networks. On this basis, a generative network model integrating multi-scale features and a contextual attention mechanism was constructed to achieve image restoration. Finally, performance and image restoration tests were conducted on the constructed model. When the coupled generative adversarial network was tested on the Set5 dataset, the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) values were 31.2575 and 0.8173; on the Set14 dataset, they were 30.8521 and 0.8079, respectively. The multi-scale feature-fusion algorithm achieved a PSNR of 30.2541 and an SSIM of 0.8352 on the BSDS100 dataset. These results indicate that the image correction model constructed in this study has strong image restoration ability: the reconstructed image has the highest similarity to the real high-resolution image and a low distortion rate, and the model can repair problems such as color distortion arising during image transmission. In addition, this study can provide technical support for similar information correction and restoration work.
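The two reported metrics can be reproduced for any output/reference image pair with scikit-image; a minimal sketch follows (file names are placeholders).

```python
from skimage import io
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

restored = io.imread("restored.png")      # generator output (placeholder path)
reference = io.imread("reference.png")    # ground-truth image (placeholder path)

psnr = peak_signal_noise_ratio(reference, restored, data_range=255)
# channel_axis=-1 treats the last axis as the RGB channel axis.
ssim = structural_similarity(reference, restored, channel_axis=-1, data_range=255)
print(f"PSNR: {psnr:.4f} dB, SSIM: {ssim:.4f}")
```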
{"title":"Application of Generative Adversarial Network in Image Color Correction","authors":"Meiling Chen, Yao Shi, Lvfen Zhu","doi":"10.1142/s021946782550069x","DOIUrl":"https://doi.org/10.1142/s021946782550069x","url":null,"abstract":"The popularity of electronic products has increased with the development of technology. Electronic devices allow people to obtain information through the transmission of images. However, color distortion can occur during the transmission process, which may hinder the usefulness of the images. To this end, a deep residual network and a deep convolutional network were used to define the generator and discriminator. Then, self-attention-enhanced convolution was applied to the generator network to construct an image resolution correction model based on coupled generative adversarial networks. On this basis, a generative network model integrating multi-scale features and contextual attention mechanism was constructed to achieve image restoration. Finally, performance and image restoration application tests were conducted on the constructed model. The test showed that when the coupled generative adversarial network was tested on the Set5 dataset, the image peak signal-to-noise ratio and image structure similarity values were 31.2575 and 0.8173. On the Set14 dataset, they were 30.8521 and 0.8079, respectively. The multi-scale feature-fusion algorithm was tested on the BSDS100 dataset with an image peak signal-to-noise ratio of 30.2541 and an image structure similarity value of 0.8352. Based on the data presented, it can be concluded that the image correction model constructed in this study has a strong image restoration ability. The reconstructed image has the highest similarity with the real high-resolution image and a low distortion rate. It can achieve the task of repairing problems such as color distortion during image transmission. In addition, this study can provide technical support for similar information correction and restoration work.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140078544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Unconstrained Face Recognition Using Infrared Images
Pub Date: 2024-02-05 | DOI: 10.1142/s0219467825500561
Asif Raza Butt, Zahid Ur Rahman, Anwar Ul Haq, Bilal Ahmed, Sajjad Manzoor
Recently, face recognition (FR) has become an important research topic due to the increase in video surveillance. However, surveillance images may contain vague, non-frontal faces, especially under unidentifiable face poses or unconstrained conditions such as bad illumination and darkness. As a result, most FR algorithms do not perform well when applied to such images. Moreover, it is common in the surveillance field that only a Single Sample per Person (SSPP) is available for identification. To resolve these issues, visible-spectrum infrared images were used, which work in entirely dark conditions without light variations. Furthermore, to effectively improve FR for both the low-quality SSPP and the unidentifiable-pose problem, this paper proposes an approach that synthesizes 3D face models and pose variations. A 2D frontal face image is used to generate a 3D face model, from which several virtual face test images with different poses are synthesized. The well-known Surveillance Camera's Face (SCface) database is used to evaluate the proposed algorithm with PCA, LDA, KPCA, KFA, RSLDA, LRPP-GRR, deep KNN, and DLIB deep learning. The effectiveness of the proposed method is verified through simulations, where increases in average recognition rates of up to 10%, 27.69%, 14.62%, 25.38%, 57.46%, 57.43%, 37.69%, and 63.28%, respectively, are observed on the SCface database.
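As a hedged sketch of one of the listed comparison baselines, the snippet below implements PCA ("eigenfaces") with a nearest-neighbor classifier, training on the single gallery sample per subject plus the synthesized pose-varied images. File names, image sizes, and the component count are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

# Gallery: the one real sample per subject (SSPP) plus synthesized pose-varied
# renderings from the 3D face model; probes: surveillance test images.
X_train = np.load("gallery_faces.npy")    # (n_images, h*w) flattened grayscale faces
y_train = np.load("gallery_labels.npy")   # subject IDs
X_test = np.load("probe_faces.npy")

pca = PCA(n_components=100, whiten=True).fit(X_train)
clf = KNeighborsClassifier(n_neighbors=1).fit(pca.transform(X_train), y_train)
predicted_ids = clf.predict(pca.transform(X_test))
```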
{"title":"Unconstrained Face Recognition Using Infrared Images","authors":"Asif Raza Butt, Zahid Ur Rahman, Anwar Ul Haq, Bilal Ahmed, Sajjad Manzoor","doi":"10.1142/s0219467825500561","DOIUrl":"https://doi.org/10.1142/s0219467825500561","url":null,"abstract":"Recently, face recognition (FR) has become an important research topic due to increase in video surveillance. However, the surveillance images may have vague non-frontal faces, especially with the unidentifiable face pose or unconstrained environment such as bad illumination and dark environment. As a result, most FR algorithms would not show good performance when they are applied on these images. On the contrary, it is common at surveillance field that only Single Sample per Person (SSPP) is available for identification. In order to resolve such issues, visible spectrum infrared images were used which can work in entirely dark condition without having any light variations. Furthermore, to effectively improve FR for both the low-quality SSPP and unidentifiable pose problem, an approach to synthesize 3D face modeling and pose variations is proposed in this paper. A 2D frontal face image is used to generate a 3D face model. Then several virtual face test images with different poses are synthesized from this model. A well-known Surveillance Camera’s Face (SCface) database is utilized to evaluate the proposed algorithm by using PCA, LDA, KPCA, KFA, RSLDA, LRPP-GRR, deep KNN and DLIB deep learning. The effectiveness of the proposed method is verified through simulations, where increase in average recognition rates up to 10%, 27.69%, 14.62%, 25.38%, 57.46%, 57.43, 37.69% and 63.28%, respectively, for SCface database as observed.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139805549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Congestion Avoidance in TCP Based on Optimized Random Forest with Improved Random Early Detection Algorithm
Pub Date: 2024-02-05 | DOI: 10.1142/s021946782550055x
Ajay Kumar, Naveen Hemrajani
Transmission control protocol (TCP) ensures that data are transported safely and accurately over the network for applications that rely on the transport protocol for reliable information delivery. Internet usage is growing, and many protocols have been developed in the network layer. Congestion leads to packet loss, and the long time required for data transmission over end-to-end TCP connections in the transport layer is one of the biggest issues with the internet. To overcome these drawbacks, an optimized random forest algorithm (RFA) with improved random early detection (IRED) is proposed for congestion prediction and avoidance in the transport layer. Data are first gathered and passed through pre-processing to improve data quality: KNN-based missing value imputation replaces values missing from the raw data, and z-score normalization scales the data to a certain range. Congestion is then predicted using the optimized RFA, with the whale optimization algorithm (WOA) used to set the learning rate as efficiently as possible, reducing error and improving forecast accuracy. To avoid congestion, the IRED method is used to keep the transport-layer network congestion-free. Performance is evaluated and compared with existing techniques in terms of accuracy, precision, recall, specificity, and error, which for the proposed model are 98%, 98%, 99%, 98%, and 1%, respectively. Throughput and latency are also evaluated to determine network performance. The proposed method performs better than existing techniques, and congestion prediction and avoidance are carried out accurately in the network.
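A minimal sketch of the described pre-processing and prediction chain using scikit-learn: KNN imputation, z-score scaling, and a random forest classifier. The WOA hyperparameter search is omitted and replaced by fixed settings; feature files and labels are placeholders.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

X = np.load("tcp_flow_features.npy")   # raw per-flow measurements, may contain NaNs
y = np.load("congestion_labels.npy")   # 1 = congested, 0 = normal

model = make_pipeline(
    KNNImputer(n_neighbors=5),         # replace missing values from nearest rows
    StandardScaler(),                  # z-score normalization
    RandomForestClassifier(n_estimators=200, random_state=0),
)
model.fit(X, y)
```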
{"title":"Congestion Avoidance in TCP Based on Optimized Random Forest with Improved Random Early Detection Algorithm","authors":"Ajay Kumar, Naveen Hemrajani","doi":"10.1142/s021946782550055x","DOIUrl":"https://doi.org/10.1142/s021946782550055x","url":null,"abstract":"Transmission control protocol (TCP) ensures that data are safely and accurately transported over the network for applications that use the transport protocol to allow reliable information delivery. Nowadays, internet usage in the network is growing and has been developing many protocols in the network layer. Congestion leads to packet loss, the high time required for data transmission in the TCP protocol transport layer for end-to-end connections is one of the biggest issues with the internet. An optimized random forest algorithm (RFA) with improved random early detection (IRED) for congestion prediction and avoidance in transport layer is proposed to overcome the drawbacks. Data are initially gathered and sent through data pre-processing to improve the data quality. For data pre-processing, KNN-based missing value imputation is applied to replace the values that are missing in raw data and [Formula: see text]-score normalization is utilized to scale the data in a certain range. Following that, congestion is predicted using an optimized RFA and whale optimization algorithm (WOA) is used to set the learning rate as efficiently as possible in order to reduce error and improve forecast accuracy. To avoid congestion, IRED method is utilized for a congestion-free network in the transport layer. Performance metrics are evaluated and compared with the existing techniques with respect to accuracy, precision, recall, specificity, and error, whose values that occur for the proposed model are 98%, 98%, 99%, 98%, and 1%. Throughput and latency are also evaluated in the proposed method to determine the performance of the network. Finally, the proposed method performs better when compared to the existing techniques and prediction, and avoidance of congestion is identified accurately in the network.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139806205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-Modal Information Fusion for Localization of Emergency Vehicles
Pub Date: 2024-02-05 | DOI: 10.1142/s0219467825500500
Arunakumar Joshi, Shrinivasrao B. Kulkarni
In urban environments, road transportation generates substantial traffic. This surge in vehicles leads to complex issues, including hindered emergency vehicle movement due to high density and congestion, and scarcity of human personnel amplifies these challenges. As traffic conditions worsen, the need for automated solutions to manage emergency situations becomes more evident: intelligent traffic monitoring can identify and prioritize emergency vehicles, potentially saving lives. However, categorizing emergency vehicles through visual analysis faces difficulties such as clutter, occlusions, and traffic variations. Visual techniques for vehicle detection rely on clear rear views, which is problematic in dense traffic. Audio-based methods, in contrast, are resilient to the Doppler effect from moving vehicles, but handling diverse background noises remains unexplored, and using acoustics for emergency vehicle localization presents challenges related to sensor range and real-world noise. Addressing these issues, this study combines visual and audio data for enhanced detection and localization of emergency vehicles in road networks; this multi-modal approach aims to bolster accuracy and robustness in emergency vehicle management. The proposed methodology consists of several key steps. The presence of an emergency vehicle is initially detected by pre-processing the visual images, removing clutter and occlusions via an adaptive background model. Subsequently, a cell-wise classification strategy using a customized Visual Geometry Group Network (VGGNet) deep learning model determines the presence of emergency vehicles within individual cells. To further reinforce detection accuracy, the outcomes of audio data analysis are integrated: spectral features are extracted from audio streams and classified with a support vector machine (SVM) model. The fused visual and audio information is used to construct a more comprehensive and refined traffic state map, which facilitates effective management of emergency vehicle transit. In empirical evaluations, the proposed solution mitigates challenges such as visual clutter, occlusions, and variations in traffic density, common issues encountered in traditional visual analysis methods. Notably, the proposed approach achieves an accuracy of approximately 98.15% in localizing emergency vehicles.
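A sketch of the audio branch as described: spectral features extracted from audio clips and classified by an SVM. MFCC statistics stand in for the unspecified "spectral features", and the file names and labels are placeholders.

```python
import numpy as np
import librosa
from sklearn.svm import SVC

def clip_features(path):
    """Fixed-length spectral descriptor: MFCC means and standard deviations."""
    y, sr = librosa.load(path, sr=22050)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

train_paths = ["siren_001.wav", "traffic_001.wav"]   # placeholder clip paths
train_labels = [1, 0]                                # 1 = emergency siren, 0 = other

X = np.stack([clip_features(p) for p in train_paths])
clf = SVC(kernel="rbf").fit(X, train_labels)
```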
{"title":"Multi-Modal Information Fusion for Localization of Emergency Vehicles","authors":"Arunakumar Joshi, Shrinivasrao B. Kulkarni","doi":"10.1142/s0219467825500500","DOIUrl":"https://doi.org/10.1142/s0219467825500500","url":null,"abstract":"In urban and city environments, road transportation contributes significantly to the generation of substantial traffic. However, this surge in vehicles leads to complex issues, including hindered emergency vehicle movement due to high density and congestion. Scarcity of human personnel amplifies these challenges. As traffic conditions worsen, the need for automated solutions to manage emergency situations becomes more evident. Intelligent traffic monitoring can identify and prioritize emergency vehicles, potentially saving lives. However, categorizing emergency vehicles through visual analysis faces difficulties such as clutter, occlusions, and traffic variations. Visual-based techniques for vehicle detection rely on clear rear views, but this is problematic in dense traffic. In contrast, audio-based methods are resilient to the Doppler Effect from moving vehicles, but handling diverse background noises remains unexplored. Using acoustics for emergency vehicle localization presents challenges related to sensor range and real-world noise. Addressing these issues, this study introduces a novel solution: combining visual and audio data for enhanced detection and localization of emergency vehicles in road networks. Leveraging this multi-modal approach aims to bolster accuracy and robustness in emergency vehicle management. The proposed methodology consists of several key steps. The presence of an emergency vehicle is initially detected through the preprocessing of visual images, involving the removal of clutter and occlusions via an adaptive background model. Subsequently, a cell-wise classification strategy utilizing a customized Visual Geometry Group Network (VGGNet) deep learning model is employed to determine the presence of emergency vehicles within individual cells. To further reinforce the accuracy of emergency vehicle presence detection, the outcomes from the audio data analysis are integrated. This involves the extraction of spectral features from audio streams, followed by classification utilizing a support vector machine (SVM) model. The fusion of information derived from both visual and audio sources is utilized in the construction of a more comprehensive and refined traffic state map. This augmented map facilitates the effective management of emergency vehicle transit. In empirical evaluations, the proposed solution demonstrates its capability to mitigate challenges like visual clutter, occlusions, and variations in traffic density common issues encountered in traditional visual analysis methods. Notably, the proposed approach achieves an impressive accuracy rate of approximately 98.15% in the localization of emergency vehicles.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139865007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DNN-HHOA: Deep Neural Network Optimization-Based Tabular Data Extraction from Compound Document Images
Pub Date: 2024-01-23 | DOI: 10.1142/s021946782550010x
Devendra Tiwari, Anand Gupta, Rituraj Soni
Extracting text from a tabular structure within a compound document image (CDI) is crucial to understanding the document. The main objective of text extraction is to extract only useful information, since tabular data represent relations between the text items in a tuple. Text in an image may have low contrast, varied style, size, alignment, and orientation, and a complex background. This work presents a three-step tabular text extraction process: pre-processing, separation, and extraction. The pre-processing step uses a guided image filter to remove various kinds of noise from the image, and improved binomial thresholding (IBT) separates the text from the image. The tabular text is then recognized and extracted from the CDI using a deep neural network (DNN) whose layer weights are optimized by the Harris Hawks optimization algorithm (HHOA). The extracted text and associated information can be used in many ways, including replicating the document in digital format, information retrieval, and text summarization. The proposed process is applied comprehensively to the UNLV, TableBank, and ICDAR 2013 image datasets; the complete procedure is implemented in Python, and performance is verified using precision metrics.
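A hedged sketch of the first two steps, guided filtering followed by text/background separation, using OpenCV (requires opencv-contrib-python for cv2.ximgproc). Otsu thresholding stands in for the paper's improved binomial thresholding (IBT), whose details are not given here; the filter radius and eps are illustrative.

```python
import cv2

img = cv2.imread("compound_doc.png", cv2.IMREAD_GRAYSCALE)  # placeholder path

# Edge-preserving noise removal; the image serves as its own guide.
smoothed = cv2.ximgproc.guidedFilter(guide=img, src=img, radius=8,
                                     eps=(0.05 * 255) ** 2)

# Stand-in for IBT: Otsu threshold, inverted so text pixels become foreground.
_, text_mask = cv2.threshold(smoothed, 0, 255,
                             cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
cv2.imwrite("text_mask.png", text_mask)
```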
{"title":"DNN-HHOA: Deep Neural Network Optimization-Based Tabular Data Extraction from Compound Document Images","authors":"Devendra Tiwari, Anand Gupta, Rituraj Soni","doi":"10.1142/s021946782550010x","DOIUrl":"https://doi.org/10.1142/s021946782550010x","url":null,"abstract":"Text information extraction from a tabular structure within a compound document image (CDI) is crucial to help better understand the document. The main objective of text extraction is to extract only helpful information since tabular data represents the relation between text lying in a tuple. Text from an image may be of low contrast, different style, size, alignment, orientation, and complex background. This work presents a three-step tabular text extraction process, including pre-processing, separation, and extraction. The pre-processing step uses the guide image filter to remove various kinds of noise from the image. Improved binomial thresholding (IBT) separates the text from the image. Then the tabular text is recognized and extracted from CDI using deep neural network (DNN). In this work, weights of DNN layers are optimized by the Harris Hawk optimization algorithm (HHOA). Obtained text and associated information can be used in many ways, including replicating the document in digital format, information retrieval, and text summarization. The proposed process is applied comprehensively to UNLV, TableBank, and ICDAR 2013 image datasets. The complete procedure is implemented in Python, and precision metrics performance is verified.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139605009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}