Pub Date: 2024-06-15 DOI: 10.1007/s12652-024-04820-z
Yasmeen A. Kassem, S. Kishk, Mohamed A. Yakout, Doaa A. Altantawy
{"title":"LW-MHFI-Net: a lightweight multi-scale network for medical image segmentation based on hierarchical feature incorporation","authors":"Yasmeen A. Kassem, S. Kishk, Mohamed A. Yakout, Doaa A. Altantawy","doi":"10.1007/s12652-024-04820-z","DOIUrl":"https://doi.org/10.1007/s12652-024-04820-z","url":null,"abstract":"","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141336895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-06-14 DOI: 10.1007/s12652-024-04819-6
Rokaya Safwat, Eman Shaaban, S. Al-Tabbakh, Karim Emara
{"title":"Rf-based fingerprinting for indoor localization: deep transfer learning approach","authors":"Rokaya Safwat, Eman Shaaban, S. Al-Tabbakh, Karim Emara","doi":"10.1007/s12652-024-04819-6","DOIUrl":"https://doi.org/10.1007/s12652-024-04819-6","url":null,"abstract":"","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141340068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-06-12 DOI: 10.1007/s12652-024-04817-8
Totan Garai
{"title":"λ-possibility-center based MCDM technique on the control of Ganga river pollution under non-linear pentagonal fuzzy environment","authors":"Totan Garai","doi":"10.1007/s12652-024-04817-8","DOIUrl":"https://doi.org/10.1007/s12652-024-04817-8","url":null,"abstract":"","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141353039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-06-07 DOI: 10.1007/s12652-024-04821-y
Habibie Akbar, Muhammad Munwar Iqbal
{"title":"Attention based: modeling human perception of reflectional symmetry in the wild","authors":"Habibie Akbar, Muhammad Munwar Iqbal","doi":"10.1007/s12652-024-04821-y","DOIUrl":"https://doi.org/10.1007/s12652-024-04821-y","url":null,"abstract":"","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141375021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-06-04 DOI: 10.1007/s12652-024-04816-9
Asif Iqbal Middya, Sarbani Roy
Missing readings from the sensors of air pollution monitoring stations are a common issue. These missing readings may greatly degrade the monitoring and analysis of air pollution data. To address this problem, this paper proposes a multi-view missing value (MV) imputation method called MVDI (Multi-View Data Imputation) for air-pollution time series data. MVDI combines four models, namely LSTM (Long Short-Term Memory), IDS (Inverse Distance Squared), SVR (Support Vector Regressor), and KNN (K-Nearest Neighbors), to estimate MVs. These four models are employed mainly to capture the variations in the data from different views of the dataset, where each view is a different portion (subset) of the actual dataset. The MV estimates from all the views are combined using a kernel function to obtain an overall result. MVDI is evaluated on a real-world air pollution dataset in terms of RMSE, MAE, MAPE, and R². The experimental results show that MVDI outperforms the baseline methods, namely AR (AutoRegressive), ARIMA (AutoRegressive Integrated Moving Average), RFR (Random Forest Regressor), ANN (Artificial Neural Network), LI (Linear Interpolation), NN (Nearest Neighbors), MI (Mean Imputation), CNN (Convolutional Neural Network), and ConvLSTM (Convolutional LSTM).
{"title":"Multiview data fusion technique for missing value imputation in multisensory air pollution dataset","authors":"Asif Iqbal Middya, Sarbani Roy","doi":"10.1007/s12652-024-04816-9","DOIUrl":"https://doi.org/10.1007/s12652-024-04816-9","url":null,"abstract":"<p>The missing readings in various sensors of air pollution monitoring stations is a common issue. Those missing sensor readings may greatly influence the performance of monitoring and analysis of air pollution data. To address this problem, in this paper, a multi-view based missing value (MV) imputation method called MVDI (<b>M</b>ulti-<b>V</b>iew <b>D</b>ata <b>I</b>mputation) is proposed for air pollution related time series data. MVDI combines four models namely LSTM (Long-Short Term Memory), IDS (Inverse Distance Squared), SVR (Support Vector Regressor), and KNN (K-Nearest Neighbors) to estimate MVs. These four models are mainly employed to capture the variations in data from different views of the dataset. Here, different views represent different portions (subsets) of the actual dataset. The estimates of MVs from all the views are combined using a kernel function to get an overall result. The proposed model MVDI is evaluated on real-world air pollution dataset in terms of RMSE, MAE, MAPE, and R<sup>2</sup>. 
The experimental results show that MVDI dominates over the baseline methods namely AR (AutoRegressive), ARIMA (AutoRegressive Integrated Moving Average), RFR (Random Forest Regressor), ANN (Artificial Neural Network), LI (Linear Interpolation), NN (Nearest Neighbors), MI (Mean Imputation), CNN (Convolutional Neural Network), ConvLSTM (Convolutional LSTM).</p>","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141259626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
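The IDS (Inverse Distance Squared) view named in the abstract above can be illustrated with a small sketch. The abstract does not say how MVDI defines distance or selects neighbors, so the function name `ids_impute` and the use of station-to-station distances are assumptions made for illustration only, not the authors' implementation.

```python
def ids_impute(readings, distances):
    """Estimate a missing reading from neighboring readings.

    Each neighbor is weighted by 1 / distance**2, so closer neighbors
    contribute more. `readings` and `distances` are parallel lists;
    both names are hypothetical and not taken from the paper.
    """
    if not readings or len(readings) != len(distances):
        raise ValueError("readings and distances must be non-empty and equal in length")
    if any(d <= 0 for d in distances):
        raise ValueError("distances must be positive")

    # Inverse-distance-squared weights, one per neighbor.
    weights = [1.0 / (d * d) for d in distances]
    weighted_sum = sum(w * r for w, r in zip(weights, readings))
    return weighted_sum / sum(weights)


# A neighbor at distance 1 gets four times the weight of one at distance 2.
estimate = ids_impute([10.0, 20.0], [1.0, 2.0])
```

In this sketch the nearer station dominates the estimate, which is the behavior the inverse-square weighting is meant to provide.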
Pub Date: 2024-06-01 DOI: 10.1007/s12652-024-04815-w
William Eric Manongga, Rung-Ching Chen
A road intersection is an area where more than two roads in different directions connect. It is a point of transition where the driver navigates and makes decisions, which makes it a high-risk area for traffic accidents. Road intersection detection is the task of identifying and analyzing road intersections in real time using various technologies and algorithms. It is an essential part of intelligent transportation systems and autonomous driving. Road intersection detection helps the driver identify an intersection early, make good driving decisions, and avoid accidents. Despite its importance, little research has been published on this topic. Existing research mainly focuses on detecting and classifying traffic signs, vehicles, and pedestrians. In this research, we propose an algorithm that detects road intersections using an image from the front-facing camera installed on the car as input. We use traffic sign detection to detect seven types of traffic signs that indicate a high probability of a nearby intersection, and combine it with our novel road intersection detection algorithm to locate the intersection. Our road intersection detection algorithm leverages the relationship between the area of the traffic signs and the location of the intersection. In the experiments, the proposed method gives promising results and can detect road intersections from greater distances. It is also able to perform detection in real time.
{"title":"Road intersection detection using the YOLO model based on traffic signs and road signs","authors":"William Eric Manongga, Rung-Ching Chen","doi":"10.1007/s12652-024-04815-w","DOIUrl":"https://doi.org/10.1007/s12652-024-04815-w","url":null,"abstract":"<p>A road intersection is an area where more than two roads in different directions connect. It is a point of transition where the driver navigates and makes the decision, making it an area with a high risk for traffic accidents. Road intersection detection is identifying and analyzing road intersections in real time using various technologies and algorithms. It is an essential part of intelligent transportation systems and autonomous driving. Road intersection detection helps the driver to identify the road intersection early to make good driving decisions and avoid accidents. Despite its high importance, only a few research is found regarding this topic. Existing research mainly focuses on detecting and classifying traffic signs, vehicles, and pedestrians. In this research, we propose an algorithm to detect road intersections using an image from the front-facing camera installed on the car as an input. We use traffic sign detection to detect seven types of traffic signs having a high probability of intersection nearby and combine it with our novel road intersection detection algorithm to detect the location of the road intersection. Our road inter-section detection algorithm leverages the relationship between the area of the traffic signs and the location of the intersection. Our proposed method gives promising results from the experiments and can detect road intersections from further distances. 
Our method is also able to perform detection in real time.</p>","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141187889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-06-01 DOI: 10.1007/s12652-024-04814-x
Soudabeh Tabarsaii, Manar Amayri, Nizar Bouguila, Ursula Eicker
Energy disaggregation, or Non-Intrusive Load Monitoring (NILM), covers a range of methods that aim to distinguish the individual contributions of appliances, given the aggregated power signal. In this paper, the application of finite Generalized Gaussian and finite Gamma mixtures to energy disaggregation is proposed and investigated. The procedure approximates the distribution of the sum of two Generalized Gaussian random variables (RVs) and the distribution of the sum of two Gamma RVs using Method-of-Moments matching. With this procedure, the probability distribution of each combination of appliance consumption is obtained and used to predict and disaggregate the data of a specific device from the aggregated data. Moreover, to make the models more practical, we propose a deep version, which we call DNN-Mixture, as a cascade model that combines a deep neural network with each of the proposed mixture models. As part of our extensive evaluation, we apply the proposed models to three datasets from different geographical locations and with different sampling rates. The results indicate the superiority of the proposed models over the Gaussian mixture model and other widely used approaches. To investigate the applicability of our models in challenging unsupervised settings, we tested them on unseen houses with unlabeled data. The outcomes demonstrated the extensibility and robustness of the proposed approach. Finally, the evaluation of the cascade model against the state of the art shows that, by benefiting from the advantages of both neural networks and finite mixtures, the cascade model can produce results that are promising and competitive with RNNs without suffering from their inherent disadvantages.
{"title":"Non intrusive load monitoring using additive time series modeling via finite mixture models aggregation","authors":"Soudabeh Tabarsaii, Manar Amayri, Nizar Bouguila, Ursula Eicker","doi":"10.1007/s12652-024-04814-x","DOIUrl":"https://doi.org/10.1007/s12652-024-04814-x","url":null,"abstract":"<p>Energy disaggregation, or Non-Intrusive Load Monitoring (NILM), involves different methods aiming to distinguish the individual contribution of appliances, given the aggregated power signal. In this paper, the application of finite Generalized Gaussian and finite Gamma mixtures in energy disaggregation is proposed and investigated. The procedure includes approximation of the distribution of the sum of two Generalized Gaussian random variables (RVs) and the approximation of the distribution of the sum of two Gamma RVs using Method-of-Moments matching. By adopting this procedure, the probability distribution of each combination of appliances consumption is acquired to predict and disaggregate the specific device data from the aggregated data. Moreover, to make the models more practical we propose a deep version, that we call DNN-Mixture, as a cascade model, which is a combination of a deep neural network and each of the proposed mixture models. As part of our extensive evaluation process, we apply the proposed models on three different datasets, from different geographical locations, that had different sampling rates. The results indicate the superiority of proposed models as compared to the Gaussian mixture model and other widely used approaches. In order to investigate the applicability of our models in challenging unsupervised settings, we tested them on unseen houses with unlabeled data. The outcomes proved the extensibility and robustness of the proposed approach. 
Finally, the evaluation of the cascade model against the state of the art shows that by benefiting from the advantages of both neural networks and finite mixtures, cascade model can produce promising and competing results with RNN without suffering from its inherent disadvantages.</p>","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141198124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
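The Method-of-Moments step mentioned in the abstract above, for the sum of two Gamma RVs, can be sketched as follows. This is a minimal illustration that assumes the two variables are independent and uses the shape/scale parameterization; the function name `match_gamma_sum` is hypothetical, and the paper's exact procedure may differ.

```python
def match_gamma_sum(k1, theta1, k2, theta2):
    """Approximate Gamma(k1, theta1) + Gamma(k2, theta2) by a single Gamma.

    Assumes independence. The mean and variance of the sum are matched
    to those of Gamma(k, theta), whose mean is k * theta and whose
    variance is k * theta**2.
    """
    mean = k1 * theta1 + k2 * theta2            # means add
    var = k1 * theta1 ** 2 + k2 * theta2 ** 2   # variances add under independence
    k = mean ** 2 / var                         # matched shape
    theta = var / mean                          # matched scale
    return k, theta


# With equal scales the sum is exactly Gamma(k1 + k2, theta),
# so the matched parameters recover shape 6 and scale 3 here.
shape, scale = match_gamma_sum(2.0, 3.0, 4.0, 3.0)
```

When the two scales differ, the result is only an approximation whose first two moments agree with those of the true sum.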
Pub Date: 2024-05-29 DOI: 10.1007/s12652-024-04818-7
Saba Hameed, Javaria Amin, Muhammad Almas Anjum, Muhammad Sharif
Demand for surveillance applications is growing, driven by the need for safety and security against anomalous events. An anomaly in a video is an event that exhibits unusual behavior. Although recognizing these anomalous events takes time, computerized methods might help to reduce it and enable efficient prediction. However, accurate anomaly detection remains a challenge due to complex background, illumination, variations, and occlusion. To handle these challenges, a vision transformer convolutional recurrent neural network, named the ViT-CNN-RCNN model, is proposed for the classification of suspicious activities based on frames and videos. The pre-trained ViT-base-patch16-224-in21k model takes 224 × 224 × 3 video frames as input and splits them into 16 × 16 patches. It has a patch embedding layer, a ViT encoder, a ViT transformer layer with 11 blocks, layer-norm, and a ViT pooler. The ViT model is trained with selected learning parameters, such as 20 training epochs and a batch size of 10, to categorize the input frames into thirteen different classes such as robbery, fighting, shooting, stealing, shoplifting, arrest, arson, abuse, exploiting, road accident, burglary, and vandalism. The CNN-RNN sequential model is designed to process sequential data and contains an input layer, a GRU layer, a GRU-1 layer, and a dense layer. This model is trained with optimal hyperparameters, such as 32 video frame sizes, 30 training epochs, and a batch size of 16, for classification into the corresponding class labels. The proposed model is evaluated on the UNI-crime and UCF-crime datasets. The experimental outcomes show that the proposed approach outperforms recently published works.
{"title":"Suspicious activities detection using spatial–temporal features based on vision transformer and recurrent neural network","authors":"Saba Hameed, Javaria Amin, Muhammad Almas Anjum, Muhammad Sharif","doi":"10.1007/s12652-024-04818-7","DOIUrl":"https://doi.org/10.1007/s12652-024-04818-7","url":null,"abstract":"<p>Nowadays there is growing demand for surveillance applications due to the safety and security from anomalous events. An anomaly in the video is referred to as an event that has some unusual behavior. Although time is required for the recognition of these anomalous events, computerized methods might help to decrease it and perform efficient prediction. However, accurate anomaly detection is still a challenge due to complex background, illumination, variations, and occlusion. To handle these challenges a method is proposed for a vision transformer convolutional recurrent neural network named ViT-CNN-RCNN model for the classification of suspicious activities based on frames and videos. The proposed pre-trained ViT-base-patch16-224-in21k model contains 224 × 224 × 3 video frames as input and converts into a 16 × 16 patch size. The ViT-base-patch16-224-in21k has a patch embedding layer, ViT encoder, and ViT transformer layer having 11 blocks, layer-norm, and ViT pooler. The ViT model is trained on selected learning parameters such as 20 training epochs, and 10 batch-size to categorize the input frames into thirteen different classes such as robbery, fighting, shooting, stealing, shoplifting, Arrest, Arson, Abuse, exploiting, Road Accident, Burglary, and Vandalism. The CNN-RNN sequential model is designed to process sequential data, that contains an input layer, GRU layer, GRU-1 Layer and Dense Layer. This model is trained on optimal hyperparameters such as 32 video frame sizes, 30 training epochs, and 16 batch-size for classification into corresponding class labels. The proposed model is evaluated on UNI-crime and UCF-crime datasets. 
The experimental outcomes conclude that the proposed approach better performed as compared to recently published works.</p>","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141198178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
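The patch arithmetic implied by the abstract above (224 × 224 × 3 frames and a 16 × 16 patch size) can be worked through in a few lines. The helper name `patch_grid` is hypothetical, and the sketch covers only non-overlapping patch extraction, not the rest of the model.

```python
def patch_grid(height, width, channels, patch):
    """Return (number of patches, values per flattened patch).

    Assumes non-overlapping square patches that tile the frame exactly.
    """
    if height % patch or width % patch:
        raise ValueError("frame size must be divisible by the patch size")
    num_patches = (height // patch) * (width // patch)
    values_per_patch = patch * patch * channels
    return num_patches, values_per_patch


# 224 / 16 = 14 patches per side, so 14 * 14 patches per frame,
# each holding 16 * 16 * 3 pixel values before embedding.
num_patches, values_per_patch = patch_grid(224, 224, 3, 16)
```

Each flattened patch is what a patch embedding layer would project into a token, so the number of patches determines the length of the sequence the encoder sees per frame.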
Pub Date: 2024-05-28 DOI: 10.1007/s12652-024-04809-8
Lu Lianju, Zhang Haiying
Because of the fast take-off speed in badminton, a single action recognition method cannot identify the action quickly and accurately. Therefore, a new badminton take-off recognition method based on improved deep learning is proposed to capture badminton take-offs accurately. Badminton videos are collected, and images of the athletes’ activity areas are obtained by tracking the moving targets in badminton competition videos. The static characteristics of the players’ take-off actions are extracted from the images of the activity areas using 3D ConvNets. From the human joint points in the player’s target tracking image, a human skeleton sequence is constructed using a 2D coordinate pseudo-image and a 2D skeleton data design algorithm, and the dynamic characteristics of the take-off action are extracted from the skeleton sequence using an LSTM (Long Short-Term Memory) network. After the static and dynamic features are fused by weighted summation, the fused take-off features are input into a convolutional neural network (CNN) to complete take-off recognition. The CNN pooling layer is improved by adaptive pooling, and network convergence is accelerated by adding batch normalization, which further optimizes the take-off recognition results. Experiments show that the human skeleton model can accurately match human movements and assist in extracting action features. The improved CNN greatly improves the accuracy of take-off recognition. When recognizing real images, it can accurately identify human movements and judge whether a take-off action is present.
{"title":"Research on badminton take-off recognition method based on improved deep learning","authors":"Lu Lianju, Zhang Haiying","doi":"10.1007/s12652-024-04809-8","DOIUrl":"https://doi.org/10.1007/s12652-024-04809-8","url":null,"abstract":"<p>Because of the fast take-off speed of badminton, a single action recognition method can’t quickly and accurately identify the action. Therefore, a new badminton take-off recognition method based on improved deep learning is proposed to capture badminton take-off accurately. Collect badminton sports videos and get images of athletes’ activity areas by tracking the moving targets in badminton competition videos. The static characteristics of badminton players’ take-off actions are extracted from the athletes’ activity areas’ images using 3D ConvNets. According to the human joint points in the badminton player’s target tracking image, the human skeleton sequence is constructed by using a 2D coordinate pseudo-image and 2D skeleton data design algorithm, and the dynamic characteristics of badminton take-off action are extracted from the human skeleton sequence by using LSTM (Long-term and Short-term Memory Network). After the static and dynamic features are fused by weighted summation, badminton take-off feature fusion results are input into a convolutional neural network (CNN) to complete badminton take-off recognition. The CNN pool layer is improved by adaptive pooling, and the network convergence is accelerated by combining batch normalization to further optimize the recognition results of badminton take-off. Experiments show that the human skeleton model can accurately match human movements and assist in extracting action features. The improved CNN has greatly improved the accuracy of recognition of take-off actions. 
When recognizing real images, it can accurately identify human movements and judge whether there is a take-off action.</p>","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141169612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
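The weighted-summation fusion of static and dynamic features described in the abstract above can be sketched as below. The abstract gives no weights, so the single mixing weight `alpha` and the function name `fuse_features` are assumptions chosen for illustration; the authors may weight the two feature streams differently.

```python
def fuse_features(static_feat, dynamic_feat, alpha=0.5):
    """Fuse two equal-length feature vectors by weighted summation.

    `alpha` weights the static features and (1 - alpha) weights the
    dynamic ones. Both the weight and its default are hypothetical.
    """
    if len(static_feat) != len(dynamic_feat):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * s + (1.0 - alpha) * d
            for s, d in zip(static_feat, dynamic_feat)]


# A quarter of the weight on the static stream, the rest on the dynamic one.
fused = fuse_features([1.0, 2.0], [3.0, 4.0], alpha=0.25)
```

Element-wise summation requires both streams to share a dimensionality, so in practice the 3D ConvNet and LSTM outputs would need to be projected to the same size before this step.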
Pub Date: 2024-05-21 DOI: 10.1007/s12652-024-04752-8
George Routis, George Katsouris, Ioanna Roussaki
{"title":"Cryptography-based location privacy protection in the Internet of Vehicles","authors":"George Routis, George Katsouris, Ioanna Roussaki","doi":"10.1007/s12652-024-04752-8","DOIUrl":"https://doi.org/10.1007/s12652-024-04752-8","url":null,"abstract":"","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141116601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}